J. FRENKEL 


WAVE MECHANICS 


ADVANCED GENERAL THEORY 


PREFACE 


THE present volume forming the second Part of my Wave Mechanics
is devoted (as foreshadowed in the Preface to Part I) to the mathe- 
matical development of the general ideas underlying the new mechanics, 
connecting it with classical mechanics and constituting it a complete 
self-supporting theory. In building up the mathematical framework of 
this theory I have limited myself to what I consider its most essen- 
tial elements, leaving aside a number of questions which have a metho- 
dological value only (such as the group theory) or which are met with 
in the solution of special problems. 

It is my intention to consider some of these questions later on in 
connexion with the special problems which will be discussed in Part III 
(‘Advanced Special Theory’); I have carefully avoided complicating the
general scheme of the theory by such special questions—with a few 
exceptions inserted for illustration (the relativistic theory of the hydro- 
gen-like atom, for example). 

To make the general scheme more comprehensible I have not spared 
space, dealing with especially important general questions (such as the 
transformation and the perturbation theory, or the relativistic theory 
of the electron) at much greater length than would be necessary from 
the point of view of an adequate presentation to a sophisticated reader. 

I must cordially thank the editors for their readiness to meet my de- 
mands on space, which have resulted in a book larger than was originally 
contemplated. I must also thank M. L. Urquhart and Miss B. Swirles 
for help in correcting the English and the proofs. 

The present book, like Part I, is complete in itself, and can be read 
without acquaintance with Part I, provided the reader is familiar with 
some elementary account of wave mechanics, and is ready to explore 
its mathematical depths to obtain a profounder insight into the theory 
and to prepare himself for applying it to various special problems. 

The earlier portions of this book were written in 1931 while I was in 
America; it was completed in Leningrad nearly two years later. Some 
of the shortcomings of the book are due to this interruption and the 
impossibility of revising it in 1933 from the very beginning. 

A list of the more important references for each section is given at the 
end of the book; it is followed by a short index which should enable the
reader to locate easily all the more important subjects treated. 


LENINGRAD J. F. 
Nov. 1933 


CONTENTS 


I. CLASSICAL MECHANICS AS THE LIMITING FORM OF WAVE
MECHANICS
1. Motion in One Dimension; Partial Reflection and Uncertainty in
the Sign of the Velocity
2. Comparison between the Schrödinger and the Classical Equation
of Motion in One Dimension; Average Velocity and Current
Density
3. Generalization for Non-stationary Motion in Three Dimensions;
The Hamilton-Jacobi Equation
4. Comparison of the Approximate Solutions of Schrödinger’s
Equation; Comparison of Classical and Wave-mechanical
Average Values
5. Motion in a Limited Region; Quantum Conditions and Average
Values


II. OPERATORS
6. Operational Form of Schrödinger’s Equation, and Operational
Representation of Physical Quantities
7. Characteristic Functions and Values of Operators; Operational
Equations: Constants of the Motion
8. Probable Values of Physical Quantities and their Change with
the Time
9. The Variational Form of the Schrödinger Equation and its
Application to the Perturbation Theory
10. Orthogonality and Normalization of Characteristic Functions for
Discrete and Continuous Spectra


III. MATRICES
11. Matrix Representation of Physical Quantities and Matrix Form
of the Equations of Motion
12. The Correspondence between Matrix and Classical Mechanics
13. Application of the Matrix Method to Oscillatory and Rotational
Motion
14. Matrix Representation in the Case of a Continuous Spectrum


IV. TRANSFORMATION THEORY
15. Restricted Transformation Theory; Matrices defined from different
‘Points of View’
16. Transformation of Matrices
17. Transformation Theory of Matrices as a Generalization of Wave
Mechanics; Transformation of Basic Quantities
18. Geometrical Representation of the Transformation Theory


V. PERTURBATION THEORY
19. Perturbation Theory not involving the Time (Method of Stationary
States)
20. Extension of the Preceding Theory to the Case of ‘Relative
Degeneracy’ and Continuous Spectra; Effect of Perturbation
on Various Physical Quantities




21. Perturbation Theory involving the Time; General Processes;
Theory of Transitions
22. First Approximation; Theory of Simple Transitions
23. Second Approximation; Theory of Combined Transitions
24. Theory of Transitions for an Undefined Initial State


VI. RELATIVISTIC REMODELLING AND MAGNETIC GENERALIZATION
OF THE WAVE MECHANICS OF A SINGLE ELECTRON
25. Simplest Form of Relativistic Wave Mechanics
26. Magnetic Forces in the [...] Non-Relativistic Wave Mechanics
27. Relativistic Wave Mechanics as a Formal Generalization of
Maxwell’s Electromagnetic Theory of Light
28. Alternative Form of the Wave Equations; Duplicity and
Quadruplicity Phenomenon
29. Pauli’s Approximate Theory in the Two-dimensional Matrix
Form; the Electron’s Magnetic Moment and Angular Momentum
30. More Exact Form of the Two-dimensional Matrix Theory; the
Electron’s Electric Moment
31. The Exact Four-dimensional Matrix Theory of Dirac
32. General Treatment of the Spin Effect; Angular Momentum and
Magnetic Moment
33. The Motion of an Electron in a Central Field of Force; Fine
Structure and Zeeman Effect
34. Negative Energy States; Positive Electrons and Neutrons
35. The Invariance of the Dirac Equation with regard to Coordinate
Transformations
36. Transformation of the Dirac Equation to Curvilinear Coordinates


VII. THE PROBLEM OF MANY PARTICLES
37. General Results. Virial Theorem, Linear and Angular Momentum
38. Magnetic Forces and Spin Effects
39. Complex Particles treated as Material Points with Inner
Coordinates; Theory of Incomplete Systems
40. Identical Particles (Electrons) and the Exclusion Principle


VIII. REDUCTION OF THE PROBLEM OF A SYSTEM OF IDENTICAL
PARTICLES TO THAT OF A SINGLE PARTICLE
41. Perturbation Theory of a System of Spinless Electrons and the
Exchange Degeneracy
42. Introduction of the Spin Coordinates and Solution of the
Perturbation Problem with Antisymmetrical Wave Functions
43. The Method of the Self-consistent Field with Factorized Wave
Functions
44. The Method of the Self-consistent Field with Antisymmetrical
Functions and Dirac’s Density Matrix
45. Approximate Solutions (Thomas-Fermi-Dirac Equation)




IX. SECOND (INTENSITY) QUANTIZATION AND QUANTUM
ELECTRODYNAMICS
46. Second Quantization with respect to Electrons
47. Intensity Quantization of Particles described in the Configuration
Space by a Symmetrical Wave Function (Einstein-Bose
Statistics)
48. Interaction between a ‘Doubly Quantized’ System and an Ordinary
System: Application to Photons
49. Electromagnetic Waves with Quantized Amplitudes; Theory of
Spontaneous Transitions and of Radiation Damping
50. Application of Quantized Electron Waves to the Emission and
Scattering of Radiation
51. Connexion between Quantized Mechanical (Electron) Waves and
Electromagnetic Waves
52. The Quantum Electrodynamics of Heisenberg, Pauli, and Dirac
53. Breit’s Formula. Concluding Remarks


REFERENCES
INDEX TO PART I
INDEX TO PART II




ADVANCED GENERAL THEORY 


I 


CLASSICAL MECHANICS AS THE LIMITING FORM 
OF WAVE MECHANICS 


1. Motion in One Dimension; Partial Reflection and Uncertainty 
in the Sign of the Velocity 

In the first part of this book we have given a general outline of the 
development and present state of wave mechanics, emphasizing the 
physical meaning of the new conceptions and avoiding, as far as pos- 
sible, formal questions connected with the mathematical expression of 
these new conceptions. We have thus been led astray from the old 
conceptions based on classical corpuscular mechanics, deepening, as it 
were, the abyss separating the old from the new mechanics. 

A systematic study of the formal questions referred to above reveals 
the wonderful fact that in spite of the fundamental physical difference 
between the new and the old mechanics, they are extremely similar 
from the mathematical point of view, i.e. from the point of view of 
the mathematical expression of the various physical quantities and the 
mathematical equations connecting them. This formal similarity forms 
a bridge over the abyss between the old and the new mechanics, 
enabling one to consider the latter as an extension or rather a refine- 
ment of the former and to establish a one-to-one correspondence 
between the old ‘classical’ and the new ‘quantum’ conceptions, quan- 
tities, and equations—a correspondence which often looks like an 
identity. 

The existence of such a correspondence is a very instructive example 
of the fact—many times already illustrated by the development of 
physics—that a drastic revision of our physical conceptions can be 
associated with a simple improvement in the underlying mathematical 
scheme. 

We shall start by considering the simplest case of the wave-mechanical 
equation, i.e. the equation describing the stationary motion of a particle 
in one dimension:

$$\frac{d^2\psi}{dx^2}+\frac{8\pi^2 m}{h^2}(W-U)\psi = 0, \tag{1}$$

the potential energy U being supposed to depend on x only (and not
upon t, otherwise the total energy W would not be constant).
If U were constant, then this equation would have a solution of the
type†

$$\psi = A e^{i\alpha x}, \tag{1a}$$

representing a sine wave travelling in the direction of the positive x-axis,
α being the positive square root of the expression 8π²m(W−U)/h² (supposed
to be positive). It must be borne in mind, however, that (1a) is
only a particular solution of (1), the general solution being

$$\psi = A' e^{i\alpha x} + A'' e^{-i\alpha x}, \tag{1b}$$

which represents the superposition of two sine waves of the same length 
travelling in opposite directions. The fact that (1) has two independent 
particular solutions, representing, under the condition U = const., 
waves travelling in opposite directions, and that its general solution is 
equal to the sum of these two, is a consequence of the fact that (1) is 
a linear equation of the second order. 
In the general case, either for a constant or a variable U(x), the
function ψ, which is a complex quantity, can be written in the form

$$\psi = A e^{i\phi}, \tag{2}$$

where A = |ψ| is its modulus and φ is its argument (both of them of
course being real). This representation of ψ suggests that it may be
possible to interpret the process described by it in a way similar to
that corresponding to expression (1a), namely, as a propagation of a
wave with a (variable) amplitude A(x) in a definite direction specified
by the phase φ(x) (positive if dφ/dx > 0 and negative if dφ/dx < 0).
Such an interpretation is, however, in general wrong, as is clearly
shown by taking for ψ the expression (1b) corresponding to U = const.
Assuming A′ and A″ to be real, we get in this case

$$A\cos\phi = (A'+A'')\cos\alpha x, \qquad A\sin\phi = (A'-A'')\sin\alpha x,$$

and consequently

$$A^2 = A'^2 + A''^2 + 2A'A''\cos 2\alpha x, \tag{2a}$$

$$\tan\phi = \frac{A'-A''}{A'+A''}\tan\alpha x. \tag{2b}$$


The functions A and ¢ can, of course, be interpreted as the resulting 
amplitude and phase at the various points, but they will not refer to 
oscillations propagated in one definite direction. It will be noticed that 
A, instead of being constant, may oscillate with x twice as rapidly as 
the phase of each of the two component waves, and that the resulting 
phase φ may alternately increase and decrease with increase of x.
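
Relations (2a) and (2b) lend themselves to a quick numerical check. The following is a minimal sketch in Python, not part of the original text; the function names and the sample values A′ = 1, A″ = 0.5, α = 2 are assumptions chosen purely for illustration.

```python
import cmath
import math

def superposition(a_fwd, a_bwd, alpha, x):
    """psi(x) = A' e^{i alpha x} + A'' e^{-i alpha x}, the general solution (1b)."""
    return a_fwd * cmath.exp(1j * alpha * x) + a_bwd * cmath.exp(-1j * alpha * x)

def modulus_squared(a_fwd, a_bwd, alpha, x):
    """Right-hand side of (2a), valid for real A' and A''."""
    return a_fwd**2 + a_bwd**2 + 2 * a_fwd * a_bwd * math.cos(2 * alpha * x)

def tan_phase(a_fwd, a_bwd, alpha, x):
    """Right-hand side of (2b), valid for real A' and A''."""
    return (a_fwd - a_bwd) / (a_fwd + a_bwd) * math.tan(alpha * x)

# Illustrative (assumed) values: A' = 1, A'' = 0.5, alpha = 2.
for x in (0.0, 0.3, 1.1, 2.7):
    psi = superposition(1.0, 0.5, 2.0, x)
    lhs = abs(psi) ** 2
    rhs = modulus_squared(1.0, 0.5, 2.0, x)
    print(f"x={x:.1f}  |psi|^2={lhs:.6f}  (2a)={rhs:.6f}")
```

The squared modulus computed directly from ψ and the closed form (2a) should agree to rounding error at every point, and the tangent of the argument of ψ should agree with (2b).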


† We shall drop in future the time factor $e^{-i2\pi Wt/h}$, the oscillatory character of ψ as
a function of the time being understood.


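Substituting (2) in (1) and taking into account the relations

$$\frac{d\psi}{dx} = \left(\frac{dA}{dx} + iA\frac{d\phi}{dx}\right)e^{i\phi},$$

$$\frac{d^2\psi}{dx^2} = \left[\frac{d^2A}{dx^2} + 2i\frac{dA}{dx}\frac{d\phi}{dx} + iA\frac{d^2\phi}{dx^2} - A\left(\frac{d\phi}{dx}\right)^2\right]e^{i\phi},$$

we get, after cancelling the common factor $e^{i\phi}$,

$$\frac{d^2A}{dx^2} + \left[\alpha^2 - \left(\frac{d\phi}{dx}\right)^2\right]A + i\left(2\frac{dA}{dx}\frac{d\phi}{dx} + A\frac{d^2\phi}{dx^2}\right) = 0.$$

Because A, φ, and the parameter

$$\alpha^2 = \frac{8\pi^2 m}{h^2}(W-U) \tag{3}$$

are all real quantities, this equation can be split up into two equations:

$$\frac{d^2A}{dx^2} + \left[\alpha^2 - \left(\frac{d\phi}{dx}\right)^2\right]A = 0, \tag{3a}$$

$$2\frac{dA}{dx}\frac{d\phi}{dx} + A\frac{d^2\phi}{dx^2} = 0. \tag{3b}$$

If the latter equation be divided by A dφ/dx, it can be immediately
integrated to give

$$2\log A + \log\frac{d\phi}{dx} = \text{const.}$$

or

$$A^2\frac{d\phi}{dx} = C \quad (= \text{const.}). \tag{4}$$

Putting dφ/dx = C/A² in (3a), we get

$$\frac{d^2A}{dx^2} + (\alpha^2 - C^2A^{-4})A = 0. \tag{4a}$$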


This equation for A = |ψ| is equivalent to the Schrödinger equation
(1) for ψ, but differs from it formally by the fact that it is not linear.
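
The constancy of A² dφ/dx asserted by (4) can be illustrated on the superposition (1b). The sketch below is an illustration only, under the assumption of real A′ and A″ and a constant α; the helper names are hypothetical. It uses the identity dφ/dx = Im(ψ′/ψ), so that A² dφ/dx = Im(ψ* dψ/dx), which for (1b) works out to α(A′² − A″²).

```python
import cmath

def psi(a_fwd, a_bwd, alpha, x):
    """General solution (1b) for constant U."""
    return a_fwd * cmath.exp(1j * alpha * x) + a_bwd * cmath.exp(-1j * alpha * x)

def dpsi_dx(a_fwd, a_bwd, alpha, x):
    """Analytic derivative of (1b) with respect to x."""
    return 1j * alpha * (a_fwd * cmath.exp(1j * alpha * x)
                         - a_bwd * cmath.exp(-1j * alpha * x))

def amplitude_sq_times_phase_gradient(a_fwd, a_bwd, alpha, x):
    """A^2 dphi/dx, evaluated as Im(conj(psi) * dpsi/dx)."""
    value = psi(a_fwd, a_bwd, alpha, x).conjugate() * dpsi_dx(a_fwd, a_bwd, alpha, x)
    return value.imag

# Assumed sample values: A' = 1, A'' = 0.5, alpha = 2, so that
# alpha * (A'^2 - A''^2) = 1.5 at every x.
for x in (0.0, 0.4, 1.3, 5.0):
    print(x, amplitude_sq_times_phase_gradient(1.0, 0.5, 2.0, x))
```

For A″ = 0 the constant reduces to αA′², the value belonging to a single wave travelling in the positive direction.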
Let us assume for a moment that Schrödinger’s equation, in the case
of a variable U, admits of a particular solution of the type (1a),
i.e. a solution representing waves travelling in one definite direction,
e.g. in the positive direction. We could then obviously identify A in
(2) with the amplitude and φ with the phase of these particular waves.
According to the definition of phase, the change of phase corresponding
to an increase of x by dx (the time being fixed) would be given in this
case by

$$d\phi = \frac{2\pi}{\lambda}\,dx = \alpha\,dx$$

(λ denoting the wave-length at the point considered).


We should thus have the equation

$$\frac{d\phi}{dx} = \alpha, \tag{5}$$

which is inconsistent with (3a) unless d²A/dx² = 0. This condition, giving
A = ax + b, is, however, in general inconsistent with the relation (4),
i.e. A²α = C, unless α = C/(ax+b)², which means a very special assumption
for the potential-energy function U (the preceding relation is fulfilled
in particular if U = const., a being equal to zero in this case).
We thus see that a one-sided wave propagation, corresponding to the
motion of a particle in one definite direction, is in general impossible.

From the point of view of the wave conception this result is very
easily explained. Thus every field of force, i.e. every inhomogeneity in
the potential energy U or the parameter α, leads to a partial reflection
of a wave impinging on it. If the inhomogeneity is due to a discontinuous
jump of α, the reflection is produced at the point (or plane) of
discontinuity. If α varies continuously, the reflection is produced
gradually (the reflected waves giving rise to reflected waves of the
second order travelling in the initial direction, and so on).

From the corpuscular point of view this means that a particle moving
along the axis of x in a field of force parallel to x may have its velocity
reversed at every instant, so that while the magnitude of the velocity is
a given function of x, its direction or sign remains uncertain.

This uncertainty constitutes the fundamental difference between the
new and the old mechanics. In the old mechanics, if the direction of
the velocity is fixed at some initial instant, then it should remain the
same so long as the kinetic energy W−U remains positive (α² > 0).
Such a determinateness does not actually exist in the phenomena of
motion. When these phenomena are described by wave mechanics, we
find Nature in a position very similar to that of a theoretical physicist
who, in performing complicated (and even simple!) calculations, often
feels a strong uncertainty about the sign (+ or −) which must be
assigned to the quantities under consideration.

This uncertainty of sign or of direction of velocity for a given magnitude
of the latter and a given position can be regarded as an ‘uncertainty
principle’ characteristic of wave mechanics and not related directly to
the uncertainty principle of Heisenberg. The difference between them is
that in the latter the localization of the particle is imagined to be effected
by means of a ‘wave packet’ involving an uncertainty not so much
in the direction of the velocity as in its magnitude, whereas in the 
present case there is no need for constructing such a packet, the fact 
asserted being not a definite position of the particle, but the connexion 
between position, which may be arbitrary (that is, specifiable in terms 
of probability only) and the magnitude of the velocity. As we have 
just seen, the uncertainty in the direction of this velocity is connected 
with the possibility of both transmission and reflection of the particle 
in every region where it is acted on by some force. At the very beginning
of this book we came upon this possibility when attempting to interpret, 
from the corpuscular point of view, the phenomena of partial reflection 
and partial transmission of light at the boundary between two homo- 
geneous bodies. Later we studied it in more detail when investigating 
the motion of material particles in a field of force according to wave 
mechanics. We can sum up the results arrived at by saying that the 
indeterminateness which constitutes the characteristic distinction be- 
tween wave mechanics and classical mechanics is due primarily to this 
ambiguity in the result produced by a force acting on the particle. 
Whereas in classical mechanics such a force must either accelerate or 
retard the particle, reversing the direction of its motion only when the 
increase of potential energy would exceed the total energy, in wave 
mechanics a force can reverse the direction of motion, leaving the 
magnitude of the velocity unchanged, even when this force is acting 
in the direction of the motion, i.e. even when, according to classical 
mechanics, the particle should be accelerated without change of 
direction. 

So far as the relation between the wave-mechanical and the classical 
equations of motion is concerned, this uncertainty in the direction or 
in the ‘sign’ of the velocity, when its magnitude and the position of 
the particle are simultaneously fixed, is much more useful than 
Heisenberg’s uncertainty principle (which is another aspect of the 
fundamental ambiguity inherent in wave mechanics). It leads us to 
expect that the results predicted by wave mechanics will approach those 
predicted by classical mechanics as the reflection coefficient tends to zero, 
i.e. when the ambiguity due to the possibility of reflection as well as
transmission vanishes. In this case, transmission, i.e. motion in the 
same direction, is the only issue that comes into consideration. 

It is easy to see that a decrease in the reflection coefficient is brought 
about by a decrease in the wave-length. When the wave-length becomes 
very small compared with the length over which the potential energy 
changes by an appreciable amount, the reflection produced by this
change of potential energy also becomes very small and vanishes in the
limiting case λ = 0.

This result can be illustrated by the fact, pointed out in Part I, § 12, 
that cathode rays pass without appreciable reflection through an 
electric condenser whose thickness is very large compared with the 
wave-length, while they are appreciably reflected if this thickness is 
reduced to zero, the potential energy change remaining the same. In 
the latter case the reflection and transmission coefficients are given 
by the well-known formulae 

$$R = \left(\frac{\alpha'-\alpha''}{\alpha'+\alpha''}\right)^2, \qquad D = 1-R = \frac{4\alpha'\alpha''}{(\alpha'+\alpha'')^2},$$

where α′ and α″ are the values of the parameter α on both sides of the
discontinuity. It may be recalled that this parameter is proportional
to the momentum g = mv, i.e. to the velocity of the electron. When
the velocity of the impinging electrons, that is α′, increases, the jump
ΔU of the potential energy remaining constant, α″ also increases, while
the difference α′−α″ decreases. We have in fact, according to (3),

$$\Delta U = U''-U' = \frac{h^2}{8\pi^2 m}\left(\alpha'^2-\alpha''^2\right),$$

whence

$$\alpha'-\alpha'' = \frac{8\pi^2 m}{h^2}\,\frac{\Delta U}{\alpha'+\alpha''},$$

or approximately

$$\frac{\alpha'-\alpha''}{\alpha'+\alpha''} \approx \frac{8\pi^2 m}{h^2}\,\frac{\Delta U}{4\alpha^2} = \frac{\Delta U}{4(W-U)},$$

that is,

$$R \approx \frac{1}{16}\left(\frac{\Delta U}{W-U}\right)^2. \tag{5a}$$

Here W−U is the average kinetic energy ½mv² of the electron on both
sides of the discontinuity, while ΔU is equal to the change of this
kinetic energy, i.e. approximately mv Δv. We thus get

$$R = \frac{1}{4}\left(\frac{\Delta v}{v}\right)^2 = \frac{1}{4}\left(\frac{\Delta\lambda}{\lambda}\right)^2, \tag{5b}$$

where λ = h/mv.
Formula (5a) shows that the reflection coefficient tends to zero when
the velocity of the electron is increased, i.e. when the wave-length λ
tends to zero, the jump of potential energy ΔU remaining constant
(Δλ is an infinitely small quantity of a higher order than λ itself).
This result holds, of course, not only for electrons but also for any
other particles: their behaviour conforms more and more to the fundamental
principle of classical mechanics, the principle of determinism,
which can be stated in the form

$$R = 0, \qquad D = 1,$$

as their velocity increases.

It should be noted that, for a given value of ΔU, the magnitude of
the velocity for which R becomes inappreciable is the smaller the larger
the mass m, since, according to (5a), it is not the velocity itself but the
kinetic energy ½mv² whose ratio to ΔU determines R.
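
The behaviour of the reflection coefficient at a sharp potential step can be tabulated numerically. The following Python sketch is illustrative only: the function names, the choice of units with h = 1 and m = 1, and the sample energies are assumptions, not taken from the text. It compares the exact R and D with the approximation (5a), taking W−U in (5a) as the mean of the kinetic energies on the two sides.

```python
import math

def alpha(kinetic_energy, mass=1.0, h=1.0):
    """alpha = sqrt(8 pi^2 m (W - U)) / h, from relation (3)."""
    return math.sqrt(8.0 * math.pi**2 * mass * kinetic_energy) / h

def step_coefficients(ke_before, delta_u, mass=1.0, h=1.0):
    """Exact R and D at a sharp step; requires ke_before > delta_u."""
    a1 = alpha(ke_before, mass, h)
    a2 = alpha(ke_before - delta_u, mass, h)
    reflection = ((a1 - a2) / (a1 + a2)) ** 2
    transmission = 4.0 * a1 * a2 / (a1 + a2) ** 2
    return reflection, transmission

def reflection_5a(ke_before, delta_u):
    """Approximation (5a), with W - U taken as the mean kinetic energy."""
    mean_ke = ke_before - 0.5 * delta_u
    return (delta_u / mean_ke) ** 2 / 16.0

# Fixed jump delta_u = 1 (arbitrary units), increasing incident kinetic energy.
for ke in (2.0, 10.0, 100.0, 1000.0):
    r, d = step_coefficients(ke, 1.0)
    print(f"KE={ke:7.1f}  R={r:.3e}  D={d:.6f}  (5a)={reflection_5a(ke, 1.0):.3e}")
```

In this sketch R depends only on the ratio of ΔU to the kinetic energy, so changing the assumed mass at fixed kinetic energy leaves R unchanged, in line with the remark following (5b).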


2. Comparison between the Schrödinger and the Classical
Equation of Motion in One Dimension; Average Velocity and
Current Density

Discontinuities in the potential-energy function U(x) do not, of course,
occur in Nature. When U(x) is a continuous function of x, i.e. when
the force has a finite value, it is possible to give another important and
interesting formulation of the condition under which the fundamental
ambiguity of wave mechanics disappears (i.e. the reflection coefficient
vanishes), the wave mechanics thus reducing to classical mechanics.

According to de Broglie’s relation λ = h/mv, the wave-length of the
waves associated with the motion of a particle is, other things being
equal, the smaller, the smaller the value of the constant h. In reality,
of course, the latter cannot be changed. If, however, it were not a
universal constant, but could have any value whatsoever, then it would
be possible to say that wave mechanics would reduce to classical
mechanics in the limiting case h = 0; for this would mean that the
wave-length would vanish for all values of the velocity. Consequently
the relative change of the potential energy in a distance of the order
of magnitude of the wave-length would also vanish, and with it the
partial reflection which is the fundamental cause of the ambiguity
characteristic of wave mechanics.
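
To give a feeling for the orders of magnitude involved, the wave-length λ = h/mv can be evaluated for a few cases. This is a minimal sketch, not part of the original text; the constants are modern SI values, the speeds are arbitrary non-relativistic examples, and the optional argument h is a hypothetical device for imitating the artifice of a variable constant.

```python
PLANCK_H = 6.62607015e-34        # Planck's constant in J s (SI defining value)
ELECTRON_MASS = 9.1093837015e-31  # electron rest mass in kg

def de_broglie_wavelength(mass, speed, h=PLANCK_H):
    """lambda = h / (m v) for a non-relativistic particle."""
    return h / (mass * speed)

# An electron at 10^6 m/s versus a one-gram body at 1 m/s (assumed examples).
print("electron:", de_broglie_wavelength(ELECTRON_MASS, 1.0e6), "m")
print("1 g body:", de_broglie_wavelength(1.0e-3, 1.0), "m")

# Shrinking h by a factor of ten shrinks the wave-length in the same ratio.
print("electron, h/10:", de_broglie_wavelength(ELECTRON_MASS, 1.0e6, h=PLANCK_H / 10), "m")
```

The electron's wave-length comes out comparable with atomic dimensions, while that of the macroscopic body is smaller by some twenty orders of magnitude.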

This result can be proved in a general way as follows: 

Let us put α = 2πg/h in equation (3a), where g (= mv) is the
magnitude of the momentum of the particle, and also

$$\phi = \frac{2\pi}{h}\,s. \tag{6}$$

Multiplying (3a) by (h/2π)², we get

$$\left(\frac{h}{2\pi}\right)^2\frac{d^2A}{dx^2} + \left[g^2 - \left(\frac{ds}{dx}\right)^2\right]A = 0, \tag{6a}$$

where

$$g^2 = 2m(W-U). \tag{6b}$$


It follows from this equation that in the limiting case h = 0 the function
s remains finite and is determined by the differential equation

$$\left(\frac{ds}{dx}\right)^2 = 2m(W-U). \tag{7}$$

The momentum g can be determined by this function unambiguously,
i.e. both with respect to magnitude and sign, by the equation

$$\frac{ds}{dx} = g, \tag{7a}$$

which is equivalent to equation (5), corresponding to the one-sided wave 
propagation, i.e. to the motion of a particle in a definite direction. 
This direction remains arbitrary, since (7) has two solutions, namely 
ds/dx = +√{2m(W−U)} and ds/dx = −√{2m(W−U)}. But once it is
chosen for some initial instant it will remain constant so long as s is a
continuous function of x without maxima or minima, where, of course,
g will change its sign after passing through the value g = 0. This 
change of sign through a continuous variation corresponds to total 
reflection and has nothing to do with the discontinuous reversal of the 
sign of g which is allowed by the exact theory embodied in the wave 
equation (1) (with h > 0) and which corresponds to partial reflection. 
The difference between the exact equation (1) and the approximate 
equation (7), so far as the ambiguity in the sign, i.e. in the direction 
of the velocity, is concerned, consists in the fact that the former, being 
a linear equation of the second order, admits both signs simultaneously 
(superposition of waves travelling in opposite directions), while the 
latter, being a quadratic equation of the first order, admits either one
sign or the other. It should be remembered that the exact equation
which is satisfied by the function s is much more complicated than (7).
This exact equation can be obtained by eliminating A from equations
(3a) and (3b) with φ = 2πs/h.

It is often convenient to use, instead of the function defined in this
way, another function S defined by the equation

$$\psi = e^{i2\pi S/h} \tag{8}$$

or

$$S = \frac{h}{2\pi i}\log\psi. \tag{8a}$$

This S is connected with s (i.e. the ‘phase’ φ) and the ‘amplitude’ A by
the relation

$$S = s + \frac{h}{2\pi i}\log A.$$


It is a complex quantity which represents both φ and A and is equivalent
to ψ.
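
The relation between S, s, and A is easily checked with complex arithmetic. The sketch below is an illustration under assumed units in which h = 1; the function names and the sample amplitude and phase are hypothetical. It builds S from a given A and s and confirms that exp(i2πS/h) reproduces ψ = A exp(iφ) with φ = 2πs/h.

```python
import cmath
import math

H = 1.0  # assumed units in which Planck's constant equals 1

def complex_action(amplitude, s, h=H):
    """S = s + (h / (2 pi i)) log A, combining phase and amplitude."""
    return s + h / (2j * math.pi) * math.log(amplitude)

def psi_from_action(big_s, h=H):
    """psi = exp(i 2 pi S / h), definition (8)."""
    return cmath.exp(2j * math.pi * big_s / h)

def psi_from_amplitude_phase(amplitude, s, h=H):
    """psi = A exp(i phi) with phi = 2 pi s / h, from (2) and (6)."""
    return amplitude * cmath.exp(2j * math.pi * s / h)

big_s = complex_action(0.7, 0.3)
print("S =", big_s)
print("psi from S:      ", psi_from_action(big_s))
print("psi from A, phi: ", psi_from_amplitude_phase(0.7, 0.3))
```

The real part of S is s, while its imaginary part carries the logarithm of the amplitude.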

Substituting the expression (8) in Schrödinger’s equation (1) and
using the relations

$$\frac{d\psi}{dx} = i\frac{2\pi}{h}\frac{dS}{dx}\,e^{i2\pi S/h}, \qquad \frac{d^2\psi}{dx^2} = \left[-\left(\frac{2\pi}{h}\right)^2\left(\frac{dS}{dx}\right)^2 + i\frac{2\pi}{h}\frac{d^2S}{dx^2}\right]e^{i2\pi S/h},$$

we get

$$\frac{h}{2\pi i}\frac{d^2S}{dx^2} + \left(\frac{dS}{dx}\right)^2 = 2m(W-U). \tag{8b}$$

If we put here h = 0, this equation reduces to (7), so that when h = 0
the two functions s and S become identical. We must now investigate
the meaning of the approximate equation (7) which they both satisfy
in this limiting case.

In a certain sense it merely expresses the law of the conservation of
energy, since ds/dx is, by definition, the momentum g of the particle
and (1/2m)(ds/dx)² is its kinetic energy.

The equation is unusual, however, in that the momentum of the
particle, and consequently its velocity, is determined as a function of
the coordinate x, whereas in the classical description of motion the
velocity, as well as the coordinate itself, usually appear as functions of
the time t. Such a description of motion is impossible in wave mechanics
because of the uncertainty in the direction of the velocity. If it is
true, however, that in the case h = 0 the wave-mechanical equation
of motion (8b) must reduce to the classical equation, then equation
(7) must be equivalent to Newton’s equation of motion

$$m\frac{d^2x}{dt^2} = -\frac{dU}{dx}, \tag{9}$$

defining x and v = dx/dt as functions of the time. This equivalence is
readily recognized as soon as we realize what is meant by defining the
velocity (or momentum) of a particle as a function of its coordinate.
Let us suppose that equation (9) has been integrated, and that x and
v have been determined as functions of the time t. Then, eliminating
the time t between them, we can express one of them, e.g. v, as a function
v(x) of the other. The acceleration d²x/dt² can then be calculated
by means of the formula

$$\frac{d^2x}{dt^2} = \frac{dv}{dt} = \frac{dv}{dx}\frac{dx}{dt} = v\frac{dv}{dx} = \frac{d}{dx}\left(\frac{v^2}{2}\right),$$

so that equation (9) can be written in the form

$$\frac{d}{dx}\left(\frac{mv^2}{2}\right) = -\frac{dU}{dx}$$

or

$$\frac{mv^2}{2} + U = \text{const.}$$

If mv = g is replaced by ds/dx and the constant is denoted by W, we
get equation (7).

We thus see that this equation expresses not only the law of con- 
servation of energy, but at the same time the classical law of motion. 
It should be mentioned that both laws are equivalent to one another 
only in the special case which we are considering here of motion in one
dimension (see below). 
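
The equivalence of equation (7) with Newton's equation (9) in one dimension can be illustrated numerically. The following Python sketch is not part of the original text; the harmonic potential U = ½kx², the unit mass, the velocity-Verlet integrator, and all function names are assumptions made for illustration. It integrates (9) in time and checks that the pairs (x, v) so obtained satisfy ½mv² + U = W throughout the motion.

```python
import math

def potential(x, k=1.0):
    """Assumed example potential U(x) = k x^2 / 2."""
    return 0.5 * k * x * x

def force(x, k=1.0):
    """-dU/dx for the potential above."""
    return -k * x

def integrate_newton(x0, v0, mass=1.0, dt=1.0e-4, steps=20000):
    """Velocity-Verlet integration of m d2x/dt2 = -dU/dx; returns (x, v) samples."""
    x, v = x0, v0
    a = force(x) / mass
    samples = [(x, v)]
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt
        a_new = force(x) / mass
        v += 0.5 * (a + a_new) * dt
        a = a_new
        samples.append((x, v))
    return samples

def max_energy_error(samples, total_energy, mass=1.0):
    """Largest deviation of m v^2 / 2 + U(x) from W along the trajectory."""
    return max(abs(0.5 * mass * v * v + potential(x) - total_energy)
               for x, v in samples)

trajectory = integrate_newton(x0=0.0, v0=1.0)   # W = 1/2 for these initial values
print("max |mv^2/2 + U - W| =", max_energy_error(trajectory, 0.5))
print("final (x, v) =", trajectory[-1])
```

The run extends past the turning point, so the velocity ends with the opposite sign; the energy relation, which fixes only the magnitude of v at each x, continues to hold.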

Another way of interpreting equation (7), or rather the fact implied
in it that the velocity v = (1/m) ds/dx of the particle is determined not as
a function of the time but as a function of the coordinate x, is to
replace the single particle under consideration by an infinite number
of copies of this particle, filling space (or the line x) in a continuous
way, so that at any instant t a copy is to be found situated at, or rather
passing through, any point x. This method is similar to one used in 
hydrodynamics except that, in the hydrodynamical case, the copies of 
a particle are replaced by actual particles (supposed to be identical), 
moving under the combined influence of external forces and forces of 
mutual action (represented by the hydrostatic pressure). Provided we 
are not interested in the individuality of the particles, i.e. in the question 
which particle is to be found at a given point, the motion of the particles 
can be specified by defining the velocity of the particle passing through 
each fixed point as a function of the coordinates of this point and, in 
general, of the time. If the velocity does not depend upon the time 
(it should be remembered that the velocity we are speaking of refers 
not to a definite particle but to a definite point) the motion is called 
stationary or steady. 

Thus the picture which can be associated with equation (7) is that 
of an assembly of copies of the particle under consideration, streaming 
steadily and filling space in a continuous way. If we select from this 
assembly a definite copy which at the time t was passing through the
point x, then, knowing the dependence of the velocity v upon x, we
can follow its motion and determine both the velocity and position of
this particular copy as functions of the time. For instance, at the
moment t+dt the copy in question will be situated at the point x + v dt,
and will have the velocity

$$v(x+dx) = v(x+v\,dt) = v(x) + v\frac{dv}{dx}\,dt,$$

which means that its acceleration is equal to v dv/dx, as was obtained above.
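
The statement that a velocity field v(x) carries the acceleration v dv/dx can be checked by differentiating such a field numerically. In the sketch below, which is illustrative only, the field is taken from the energy relation with the assumed potential U = ½x², total energy W = 1, and unit mass; the names and the step size are hypothetical choices. The result should agree with −(1/m) dU/dx = −x.

```python
import math

def velocity_field(x, total_energy=1.0, mass=1.0):
    """Positive branch v(x) = sqrt(2 (W - U(x)) / m) with assumed U = x^2 / 2."""
    return math.sqrt(2.0 * (total_energy - 0.5 * x * x) / mass)

def acceleration_from_field(x, eps=1.0e-6):
    """v dv/dx, with dv/dx estimated by a central difference."""
    dv_dx = (velocity_field(x + eps) - velocity_field(x - eps)) / (2.0 * eps)
    return velocity_field(x) * dv_dx

for x in (-0.8, 0.0, 0.4, 1.0):
    print(f"x={x:+.1f}  v dv/dx={acceleration_from_field(x):+.6f}  -dU/dx={-x:+.6f}")
```

The points are chosen inside the classically accessible region |x| < √2, where the field is real.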

We have thus shown that the wave-mechanical equation of motion 
actually reduces to the classical equation in the limiting case when the 
wave-length associated with the motion of a particle tends to zero, 
either owing to increase in velocity (which is a thing that can actually 
happen) or to decrease in the constant h (which is an artifice). The
fundamental reason for this lies in the elimination of partial reflection,
i.e. of a reversal in the direction of the velocity or, in other words, the
elimination of the uncertainty in its sign. 

Strictly speaking, however, this uncertainty cannot be eliminated. 
It is impossible to describe the motion of a particle in the classical way, 
i.e. as a determinate change of position and velocity with the time. 
The only way of describing it is to ascertain the probability of finding 
the particle at a given place and the probability that, being at this 
place, it is moving in the one or the other direction (the magnitude of 
the velocity being fixed). This intrusion of the probability conception 
into the description of the motion is necessary because of the ambiguity 
arising from the alternative: partial reflection or partial transmission. 
One could say that this ambiguity—wholly alien to classical mechanics 
—forms the gate through which the concept of probability penetrates 
into the realm of physics. 

The probability of position is measured, as we know, by the product
$\psi\psi^*$, so that $\psi(x)\psi^*(x)\,dx$ measures the probability that the particle is
situated in the region between $x$ and $x+dx$. Using the picture of an
assembly of copies of the particle in question filling space (or the x-axis)
in a continuous way, we can interpret $\psi\psi^*\,dx$ as the relative number
of copies situated within the interval $dx$ (this number is independent of
the time so long as $\psi = \psi^0 e^{-i2\pi\nu t}$, corresponding to a motion with a
definite total energy $W = h\nu$). If the integral $\int_{-\infty}^{+\infty}\psi\psi^*\,dx$ converges,
$\psi$ can be normalized in such a way that this integral is equal to 1, in
agreement with the usual normalization of probability. Otherwise we
need not worry about this normalization, since after all only relative
values of $\psi\psi^*$ for different points come into account.
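
The normalization just described can be sketched numerically. The Gaussian trial function, the grid, and the helper names `trapezoid` and `normalize` below are assumptions made for illustration only; the point is merely that rescaling $\psi$ fixes the integral at 1 without changing relative values of $\psi\psi^*$.

```python
import math

def trapezoid(values, dx):
    """Trapezoidal estimate of the integral of equally spaced samples."""
    return dx * (sum(values) - 0.5 * (values[0] + values[-1]))

def normalize(psi_values, dx):
    """Rescale psi so that the integral of psi*psi^* over the grid equals 1."""
    norm = trapezoid([abs(p) ** 2 for p in psi_values], dx)
    return [p / math.sqrt(norm) for p in psi_values]

DX = 0.01
xs = [-10.0 + k * DX for k in range(2001)]            # grid from -10 to +10
psi_raw = [3.0 * math.exp(-x * x / 2.0) for x in xs]  # arbitrary overall factor 3
psi_unit = normalize(psi_raw, DX)

total_probability = trapezoid([abs(p) ** 2 for p in psi_unit], DX)
# Relative probabilities of two points (x = 0 and x = 1) before and after rescaling.
ratio_raw = abs(psi_raw[1000]) ** 2 / abs(psi_raw[1100]) ** 2
ratio_unit = abs(psi_unit[1000]) ** 2 / abs(psi_unit[1100]) ** 2
```

The two ratios agree, which is why the normalization can be ignored whenever the integral diverges and only relative probabilities are wanted.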

It should be noticed that in the classical description of the motion 
we can also use a continuous assembly of copies instead of an individual 
particle, as is actually done when the equation of motion is written in
the form (7) corresponding to the determination of the velocity as a
function of the coordinate and not of the time. From the point of 
view of this description the difference between the old and the new 
theory can be summed up as follows. In the old theory it is always 
possible to ‘individualize’ a certain copy by following its motion, i.e. by 
determining its coordinate and velocity as definite functions of time, 
whereas in the new theory such ‘individualization’ is impossible, the 
direction of motion being uncertain. It thus becomes necessary to con- 
sider the assembly as a whole without attempting to disentangle it, 
i.e. to trace the motion of a particular copy in time. This being so, the 
density of the assembly, i.e. the relative number of copies per unit
range, or, in other words, the probability of finding the particle
represented by these copies in a given range, becomes the primary thing
that can and must be determined, whereas in classical mechanics it
remains irrelevant and therefore arbitrary. Of course the determination
of $\psi\psi^*$ in wave mechanics is also connected with some arbitrariness,
which can only be removed by specifying the boundary conditions or
the conditions at infinity for the function $\psi$.

Knowing the function $\psi$, one can determine many other things besides
the probability of position. Thus by means of it we can determine the
probability of the two opposite directions of motion, that is, of the two
opposite signs of the velocity, if the magnitude of the velocity is
assumed to be fixed for a given position by the classical relation
$v = \sqrt{2(W-U)/m}$ or by de Broglie's relation $v = h/(m\lambda)$. If $p'$ is
the probability of the positive direction and $p''$ that of the negative
direction, then the average or probable value of the velocity at a given
point is given by the formula

$$\bar v = (p' - p'')\,|v| \tag{10}$$

with the condition $p' + p'' = 1$.

This probable velocity, or the probabilities $p$, can be determined
quite generally with the help of the relation (4), as soon as the physical
meaning of this relation is recognized. We shall first see what the
expression $A^2\,d\phi/dx$ means in the simple case of a wave travelling in
one direction in a force-free space, that is, a wave representing the free
motion of a particle in one direction. We have, in this case, according
to (1a), $\phi = \alpha x$ and consequently
$A^2\,d\phi/dx = A^2\alpha = \frac{2\pi m}{h}|\psi|^2 v$.
If $|\psi|^2$ is interpreted as the (relative) density of the copies of the
particle, then the product $|\psi|^2 v = j$ must obviously be defined as the
corresponding current density, i.e. the (relative) number of copies passing
through the given point or plane $x = \text{const.}$ in the direction of $v$ in
unit time. If $|\psi|^2$ is interpreted as the probability density, then $j$ can
be defined as the probability current density, i.e. the probability that
the particle will cross the plane $x = \text{const.}$ in unit time. The ratio
$j/|\psi|^2$ is nothing else than the actual velocity of motion, which, in view
of the fact that the direction of the motion is perfectly definite, coincides
with the probable velocity $\bar v$ ($p'$ or $p'' = 1$).

It is natural to extend the above interpretation of the expression
$A^2\,d\phi/dx$ as a measure of the current density to any type of wave
function $\psi$, for from this point of view the fact that $A^2\,d\phi/dx$ is constant
(i.e. independent of $x$) simply means that the number of copies passing
through different planes $x = x_1$ and $x = x_2$, say, is the same, just as
if they were actual indestructible particles. The law expressed by the
relation (4) would thus be the law of conservation of the number of
copies or of the conservation of probability (see below). If this
interpretation is correct, then it must obviously be possible to write $j$ in
the form

$$j = \psi\psi^*\,\bar v, \tag{10a}$$

where $\bar v$ denotes the probable velocity of the copies at the point in
question. Now this is actually the case if $j$ is defined as
$\frac{h}{2\pi m}A^2\frac{d\phi}{dx}$
(the coefficient $h/2\pi m$ is the same as in the special case considered
above), which gives the following expression for the probable velocity

$$\bar v = \frac{h}{2\pi m}\frac{d\phi}{dx}. \tag{10b}$$

The ‘phase’ $\phi$ can be expressed in terms of the function $\psi = Ae^{i\phi}$ and
its conjugate complex $\psi^* = Ae^{-i\phi}$ by means of the formula

$$\phi = \frac{1}{2i}\log\frac{\psi}{\psi^*},$$

whence it follows that

$$\bar v = \frac{h}{4\pi i m}\left(\frac{1}{\psi}\frac{d\psi}{dx} - \frac{1}{\psi^*}\frac{d\psi^*}{dx}\right) = \frac{h}{2\pi m}\,R\!\left(\frac{1}{i\psi}\frac{d\psi}{dx}\right),$$

or, according to (8),

$$\bar v = \frac{1}{m}\,R\!\left(\frac{dS}{dx}\right), \tag{10c}$$

$R(f)$ denoting the real part of $f$. In the classical theory this equation
reduces to $\bar v = v$, in accordance with the fact that the motion proceeds
in a perfectly definite direction, the probabilities $p'$ and $p''$ being equal
respectively to 1 and 0. In the wave-mechanical theory $|\bar v|$ is, in general,
different from $|v|$, the values of the probabilities $p'$ and $p''$ being
different from both 1 and 0. They can be determined from $v$ and $\bar v$ by
means of the formula

$$p = \frac{1}{2}\left(1 \pm \frac{\bar v}{|v|}\right).$$

Substituting (10c) in (10a), we get the following expression for the
current

$$j = \frac{h}{4\pi i m}\left(\psi^*\frac{d\psi}{dx} - \psi\frac{d\psi^*}{dx}\right). \tag{11}$$


We shall now check these results by applying them to two simple cases.
We shall put first

$$\psi = A'e^{i\alpha x} + A''e^{-i\alpha x},$$

which corresponds to the free motion of a particle along the x-axis in
an unspecified direction.

Assuming the coefficients $A$ for the sake of simplicity to be real (this
condition does not involve any loss of generality, for it can always be
satisfied by a suitable choice of the origin $x = 0$), we have

$$\psi^* = A'e^{-i\alpha x} + A''e^{i\alpha x},$$

whence

$$\frac{1}{i}\psi^*\frac{d\psi}{dx} = \alpha\,(A'e^{-i\alpha x} + A''e^{i\alpha x})(A'e^{i\alpha x} - A''e^{-i\alpha x})$$
$$= \alpha(A'^2 - A''^2) + \alpha(A'A''e^{i2\alpha x} - A'A''e^{-i2\alpha x})$$
$$= \alpha(A'^2 - A''^2) + i2\alpha A'A''\sin 2\alpha x,$$

so that $j$ reduces to the constant value

$$j = \frac{h\alpha}{2\pi m}(A'^2 - A''^2),$$

or

$$j = |v|\,(A'^2 - A''^2). \tag{11a}$$

Unlike $j$, the probable velocity

$$\bar v = \frac{j}{\psi\psi^*} = |v|\,\frac{A'^2 - A''^2}{A'^2 + A''^2 + 2A'A''\cos 2\alpha x}$$

is a function of $x$, varying periodically between the values

$$\bar v_{\max} = |v|\,\frac{A' + A''}{A' - A''}$$

and

$$\bar v_{\min} = |v|\,\frac{A' - A''}{A' + A''}.$$
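
The first check can also be carried out numerically. The sketch below evaluates the current from formula (11) in its real-part form and the probable velocity for the two-wave superposition; the units ($h = 2\pi$, $m = 1$), the value of $\alpha$, the amplitudes 1 and 0.5, and all function names are assumptions chosen only for illustration.

```python
import cmath
import math

# Assumed units for illustration only: h = 2*pi and m = 1, so h/(2*pi*m) = 1.
H = 2.0 * math.pi
M = 1.0
ALPHA = 2.0                                     # wave number alpha = 2*pi/lambda
V_CLASSICAL = H * ALPHA / (2.0 * math.pi * M)   # |v| = h/(m*lambda)

def psi(x, a_fwd, a_back):
    """Superposition of waves travelling in the +x and -x directions."""
    return a_fwd * cmath.exp(1j * ALPHA * x) + a_back * cmath.exp(-1j * ALPHA * x)

def dpsi_dx(x, a_fwd, a_back):
    """Analytic derivative of psi with respect to x."""
    return 1j * ALPHA * (a_fwd * cmath.exp(1j * ALPHA * x)
                         - a_back * cmath.exp(-1j * ALPHA * x))

def current(x, a_fwd, a_back):
    """j = (h/(2*pi*m)) * Re[(1/i) psi^* dpsi/dx], equivalent to formula (11)."""
    value = psi(x, a_fwd, a_back).conjugate() * dpsi_dx(x, a_fwd, a_back) / 1j
    return H / (2.0 * math.pi * M) * value.real

def probable_velocity(x, a_fwd, a_back):
    """Ratio of the current to the density psi*psi^*."""
    return current(x, a_fwd, a_back) / abs(psi(x, a_fwd, a_back)) ** 2

A_FWD, A_BACK = 1.0, 0.5
currents = [current(0.01 * k, A_FWD, A_BACK) for k in range(400)]
v_largest = probable_velocity(math.pi / (2.0 * ALPHA), A_FWD, A_BACK)  # cos(2*alpha*x) = -1
v_smallest = probable_velocity(0.0, A_FWD, A_BACK)                     # cos(2*alpha*x) = +1
```

With these assumed amplitudes the current stays at $|v|(A'^2 - A''^2)$ for every $x$, while the probable velocity swings between $|v|/3$ and $3|v|$, the latter exceeding the classical magnitude.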


The fact that the maximum value of the probable velocity turns
out to be larger than the magnitude of the classical velocity $|v|$
invalidates the idea considered above of taking the latter over into the
wave-mechanical theory as the magnitude of the ‘actual’ velocity. With
$|\bar v/v| > 1$ formula (10) leads to values of the probabilities $p$ which are
devoid of physical meaning, one of them being larger than 1 and the
other smaller than 0. Although the classical velocity can be determined
wave-mechanically from the wave-length $\lambda$ (by means of the formula
$|v| = h/(m\lambda)$), yet it is the probable velocity $\bar v$ only which has a direct
physical significance.
This is also clearly seen if we take as a second example the case

$$\psi = A'e^{\beta x} + A''e^{-\beta x}$$

corresponding to a region of total reflection where the kinetic energy is
negative and the velocity $v$ is imaginary. We have in this case $\psi^* = \psi$,
$j = 0$, and $\bar v = 0$, as might be expected.
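
The breakdown of formula (10) can be made concrete with a few lines of arithmetic. The helper names `direction_probabilities` and `is_physical` and the sample values are hypothetical; the function simply solves (10) together with $p' + p'' = 1$ for the two probabilities.

```python
def direction_probabilities(v_bar, v_abs):
    """Solve v_bar = (p' - p'')*|v| with p' + p'' = 1 for (p', p'')."""
    p_forward = 0.5 * (1.0 + v_bar / v_abs)
    p_backward = 0.5 * (1.0 - v_bar / v_abs)
    return p_forward, p_backward

def is_physical(probabilities):
    """A probability must lie between 0 and 1 inclusive."""
    return all(0.0 <= p <= 1.0 for p in probabilities)

one_sided = direction_probabilities(1.0, 1.0)   # definite direction: v_bar = |v|
standing = direction_probabilities(0.0, 1.0)    # no net current: v_bar = 0
excessive = direction_probabilities(3.0, 1.0)   # v_bar = 3|v|, as at a density minimum
```

The third case returns one value above 1 and one below 0, which is the sense in which the text calls such probabilities devoid of physical meaning.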


3. Generalization for Non-stationary Motion in Three Dimensions; The Hamilton-Jacobi Equation
We shall now generalize the results of the preceding section to the 
motion of a particle in three dimensions under the action of forces 
derived from a potential-energy function U which may depend not only 
upon the coordinates x, y, z, but also upon the time t.

The wave-mechanical description of such a motion is given by the
generalized equation of Schrödinger

$$\nabla^2\psi - \frac{8\pi^2 m}{h^2}\left(\frac{h}{2\pi i}\frac{\partial}{\partial t} + U\right)\psi = 0. \tag{12}$$

Our main object will be to trace the relation of this equation to the
corresponding classical equations of motion,

$$m\frac{d^2x}{dt^2} = -\frac{\partial U}{\partial x}, \qquad m\frac{d^2y}{dt^2} = -\frac{\partial U}{\partial y}, \qquad m\frac{d^2z}{dt^2} = -\frac{\partial U}{\partial z}. \tag{12a}$$


The general character of this relation can be described in a way similar 
to that used for the one-dimensional motion discussed above. The 
fundamental characteristics of the wave-mechanical theory can thus be 
partially reduced, as before, to the ambiguity arising from the pheno- 
menon of partial reflection and partial transmission—a phenomenon 
which implies a sudden change in the direction of the velocity, its 
magnitude being assumed to be the same function of the coordinates 
as in the classical theory. 

The uncertainty in the direction of the velocity, which in the case 
of one-dimensional motion was equivalent to an ambiguity of sign, is 
now—in the case of motion in space—of a still more distressing 
character. However, we may still expect this uncertainty, as well as 
partial reflection, to vanish in the limiting case of motion corresponding
to infinitely short wave-lengths (which can be realized by an increase
of velocity or of mass, or by a fictitious decrease of the constant h).
Thus in this limiting case equation (12) must become equivalent to 
equations (12a) in the sense of admitting particular solutions corre- 
sponding to a perfectly definite type of classical motion. 

To demonstrate this equivalence we shall replace the particle under 
consideration by an assembly of copies distributed and moving in space 
like the particles of some continuous fluid (without interaction of 
course!). The velocity vector v of each copy can then be defined— 
according to the classical theory—as a function of the coordinates 
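x, y, z of the (fixed) point through which this copy is passing, and of
the time, the motion being not necessarily a steady one. It should be
noticed that the partial derivative $\partial\mathbf{v}/\partial t$ of $\mathbf{v}$ with regard to the time
does not define the acceleration of a given copy, for it refers to different
copies passing through the same point at different instants of time $t$
and $t+dt$. This acceleration can be defined by the total derivative
$d\mathbf{v}/dt$, its x-component being thus given by

$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + \frac{\partial v_x}{\partial x}\frac{dx}{dt} + \frac{\partial v_x}{\partial y}\frac{dy}{dt} + \frac{\partial v_x}{\partial z}\frac{dz}{dt}$$

or

$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + v_x\frac{\partial v_x}{\partial x} + v_y\frac{\partial v_x}{\partial y} + v_z\frac{\partial v_x}{\partial z}. \tag{13}$$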


We shall now assume the motion of the fluid formed by our assembly 
of copies to be irrotational, which means that the velocity vector can 
be represented as the gradient of a scalar function, the so-called ‘velocity 
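potential’. We shall denote this function by $s/m$ and put accordingly

$$m\mathbf{v} = \nabla s, \tag{13a}$$

that is

$$v_x = \frac{1}{m}\frac{\partial s}{\partial x}, \qquad v_y = \frac{1}{m}\frac{\partial s}{\partial y}, \qquad v_z = \frac{1}{m}\frac{\partial s}{\partial z}.$$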


We make this assumption (which is by no means necessary) not only 
because we desire to simplify the formulation of the classical theory as 
applied to the copy assembly, but also because we wish to establish the 
connexion between this theory and the wave-mechanical theory. We 
have in fact, for a wave propagated in one definite direction, a relation 
exactly similar to (13a) between the phase $\phi$ and the vector $\boldsymbol{\alpha}$ whose
direction is the direction of propagation and whose length is $2\pi/\lambda$, where
$\lambda$ is the value of the wave-length at the corresponding point:

$$\boldsymbol{\alpha} = \nabla\phi. \tag{14}$$

If we put

$$\boldsymbol{\alpha} = \frac{2\pi}{h}\,m\mathbf{v}, \tag{14a}$$

according to de Broglie’s relation, we get

$$\phi = \frac{2\pi}{h}\,s \tag{14b}$$

as before [cf. (6), § 1]. Thus, by assuming irrotational motion of the
assembly of copies, it becomes possible to establish a connexion between
the motion of a particle and the propagation of waves in the limiting 
case of infinitely short waves, i.e. when partial reflection is excluded 
and the motion of every copy of the particle proceeds along a perfectly 
definite path; this path can be considered as the ‘ray’ passing through 
the point at which the copy in question was initially situated. If partial 
reflection does take place the idea of rays loses all meaning, each ray 
branching into two at every point. Only by neglecting reflection can 
one speak of rays as lines along which the waves, i.e. the surfaces of 
constant phase, are propagated. 

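Returning to the expression (13) for the x-component of the
acceleration of the copy passing through the point x, y, z at the instant t, we
can, because of (13a), rewrite it in the form

$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + v_x\frac{\partial v_x}{\partial x} + v_y\frac{\partial v_y}{\partial x} + v_z\frac{\partial v_z}{\partial x},$$

since $\frac{\partial v_x}{\partial y} = \frac{1}{m}\frac{\partial^2 s}{\partial x\,\partial y} = \frac{\partial v_y}{\partial x}$, etc. Therefore

$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + \frac{\partial}{\partial x}\left(\frac{v^2}{2}\right)$$

or

$$\frac{dv_x}{dt} = \frac{1}{m}\frac{\partial}{\partial x}\left[\frac{\partial s}{\partial t} + \frac{1}{2m}(\nabla s)^2\right].$$

The equation $m\,\frac{dv_x}{dt} = -\frac{\partial U}{\partial x}$, which is the first of the equations (12a), is
thus equivalent to

$$\frac{\partial}{\partial x}\left[\frac{\partial s}{\partial t} + \frac{1}{2m}(\nabla s)^2 + U\right] = 0.$$

Similar results are obtained for the second and the third equations, and
so all three of them can be replaced by the single equation

$$\frac{\partial s}{\partial t} + \frac{1}{2m}(\operatorname{grad} s)^2 + U = F(t),$$

where $F(t)$ is an arbitrary function of the time alone. This function,
without loss of generality, can be put equal to zero, for it corresponds
to an additive term $\int F(t)\,dt$ in $s$ which is irrelevant for the
determination of the velocity according to (13a). The function $s$ can thus be
defined by the equation

$$\frac{\partial s}{\partial t} + \frac{1}{2m}\left[\left(\frac{\partial s}{\partial x}\right)^2 + \left(\frac{\partial s}{\partial y}\right)^2 + \left(\frac{\partial s}{\partial z}\right)^2\right] + U = 0. \tag{15}$$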


This equation was established by Hamilton and Jacobi and bears 
their name. In the special case when U does not depend upon the time 
explicitly (constant field of force), the function s—usually called the 
(mechanical) ‘action’—reduces to 


$$s = s_0(x, y, z) - Wt, \tag{15a}$$

where $s_0$ is determined by the equation

$$\frac{1}{2m}\left[\left(\frac{\partial s_0}{\partial x}\right)^2 + \left(\frac{\partial s_0}{\partial y}\right)^2 + \left(\frac{\partial s_0}{\partial z}\right)^2\right] + U = W. \tag{15b}$$

Here W is a constant which can obviously be defined as the energy.
Thus, in a sense, equation (15b), in conjunction with the relation
(13a), expresses the law of the conservation of energy. However, as we
have just seen, it expresses much more than that,† since, in conjunction
with (13a), it is equivalent to the three classical equations of motion 
(12a) for the special case of an invariable field of force and of a fixed 
value of the total energy. The equations (12a) and (15b)—or more 
generally (15)—are formally different because the former refer to an 
individual particle, while the latter refer to a continuous assembly of 
copies of this particle. If we select a definite copy and follow its motion 
we come back to equations (12a). 
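
As a quick illustration of how (15) together with (13a) reproduces a definite classical motion, consider the simplest assumed case of force-free motion, $U = 0$, with a trial action that is linear in the coordinates and the time. The constant vector $\mathbf{p}$ is introduced here for the example only.

```latex
% Assumed illustration: force-free motion (U = 0) with constant vector p.
s = p_x x + p_y y + p_z z - Wt, \qquad
\frac{\partial s}{\partial t} = -W, \qquad
\nabla s = \mathbf{p}
\quad\Longrightarrow\quad
-W + \frac{p^2}{2m} = 0, \qquad
\mathbf{v} = \frac{1}{m}\nabla s = \frac{\mathbf{p}}{m}.
```

Equation (15) thus fixes $W = p^2/2m$, and (13a) gives every copy the same constant velocity $\mathbf{p}/m$, i.e. uniform rectilinear motion, as equations (12a) require when $U = 0$.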

It can now easily be shown that in the limiting case of infinitely 
small wave-length the wave equation (12) admits particular solutions 
of the form $\psi = Ae^{i\phi}$, representing a one-sided propagation of waves
which can be associated, by means of the relations (14), (14a), and 
(14b), with the motion of the particle in question according to the 
classical theory, the different ‘rays’ coinciding with the paths of the 
different copies of this particle. 

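Putting $\psi = Ae^{i\phi}$ we get in the same way as in § 1

$$\frac{\partial^2\psi}{\partial x^2} = e^{i\phi}\left[\frac{\partial^2 A}{\partial x^2} + 2i\frac{\partial A}{\partial x}\frac{\partial\phi}{\partial x} + iA\frac{\partial^2\phi}{\partial x^2} - A\left(\frac{\partial\phi}{\partial x}\right)^2\right],$$

whence

$$\nabla^2\psi = \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} = e^{i\phi}\left[\nabla^2 A - A(\nabla\phi)^2 + i(2\nabla A\cdot\nabla\phi + A\nabla^2\phi)\right].$$

† Except in the one-dimensional case.

We have further

$$\frac{\partial\psi}{\partial t} = \frac{\partial A}{\partial t}e^{i\phi} + iAe^{i\phi}\frac{\partial\phi}{\partial t}.$$

Substituting these expressions in equation (12), cancelling the common
factor $e^{i\phi}$, and separating the real and imaginary parts, we obtain the
two equations:

$$\nabla^2 A - \left[\frac{8\pi^2 m}{h^2}\left(\frac{h}{2\pi}\frac{\partial\phi}{\partial t} + U\right) + (\nabla\phi)^2\right]A = 0,$$

$$\frac{4\pi m}{h}\frac{\partial A}{\partial t} + 2\nabla A\cdot\nabla\phi + A\nabla^2\phi = 0.$$

If $\phi$ is replaced by $\frac{2\pi}{h}s$, these equations become

$$-\frac{h^2}{8\pi^2 m}\frac{\nabla^2 A}{A} + \frac{\partial s}{\partial t} + \frac{1}{2m}(\nabla s)^2 + U = 0, \tag{16}$$

$$2m\frac{\partial A}{\partial t} + 2\nabla A\cdot\nabla s + A\nabla^2 s = 0. \tag{16a}$$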

Putting $h = 0$ we see that the first of these equations reduces to the
Hamilton-Jacobi equation (15). The same result is obtained if $\nabla^2 A = 0$,
which must obviously express the general condition for one-sided
propagation of waves of finite length. In both cases the wave-mechanical
theory becomes completely equivalent to the classical theory. Both
cases are, of course, fictitious, $h$ being a constant and the equation
$\nabla^2 A = 0$ being satisfied only under very special conditions, in
particular for force-free motion. The equation (16) can, however,
reduce approximately to (15) in the case of a nearly one-sided wave
propagation with a very weak partial reflection, so weak that the
reflected (or scattered) waves can be neglected. This condition is more
nearly approached the larger the mass $m$ of the particle for a given
velocity or the larger the velocity for a given mass, i.e. the smaller the
wave-length, if we are treating motion corresponding to a constant
value of the energy $W$. In the latter case the wave-length becomes
a definite function of the coordinates. In the general case the idea of
wave-length has no precise meaning and can be introduced only by
representing the wave function $\psi$ as a superposition of waves with
different frequencies, corresponding to motions with different energies.

If U does not contain the time explicitly, equations (16) and (16a)
admit particular solutions of the type $s = s_0(x, y, z) - Wt$ and
$A = A(x, y, z)$, i.e. $\partial s/\partial t = -W$ and $\partial A/\partial t = 0$. They therefore reduce to

$$-\frac{h^2}{8\pi^2 m}\frac{\nabla^2 A}{A} + \frac{1}{2m}(\nabla s_0)^2 + U = W \tag{17}$$

and

$$2\nabla A\cdot\nabla s_0 + A\nabla^2 s_0 = 0. \tag{17a}$$

In the limiting case $h = 0$ the first of these becomes equivalent to the
classical equation (15b).
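
To see how much the first term of (17) can matter when the wave is far from one-sided, the sketch below evaluates the left side of (17) for a standing (real) wave with $\nabla s_0 = 0$. The Gaussian amplitude, the quadratic potential, the units ($h = 2\pi$, $m = 1$) and all names are assumptions introduced for illustration; they are not taken from the text.

```python
import math

# Assumed units for illustration only: h = 2*pi, m = 1, omega = 1.
H = 2.0 * math.pi
M = 1.0
OMEGA = 1.0
HBAR = H / (2.0 * math.pi)

def amplitude(x):
    """Assumed real amplitude A(x): a Gaussian, with s0 = const (standing wave)."""
    return math.exp(-M * OMEGA * x * x / (2.0 * HBAR))

def potential(x):
    """Assumed potential energy U(x) = m*omega^2*x^2/2."""
    return 0.5 * M * OMEGA ** 2 * x * x

def amplitude_term(x, step=1e-3):
    """First term of (17): -(h^2/(8*pi^2*m)) * A''/A, by central differences."""
    second = (amplitude(x + step) - 2.0 * amplitude(x) + amplitude(x - step)) / step ** 2
    return -(H ** 2 / (8.0 * math.pi ** 2 * M)) * second / amplitude(x)

def left_side_of_17(x):
    """Left side of (17) with grad s0 = 0, so only two terms remain."""
    return amplitude_term(x) + potential(x)

samples = [left_side_of_17(x) for x in (0.0, 0.5, 1.0, 2.0)]
```

For this assumed example the left side comes out the same at every sampled point (0.5 in these units), so the amplitude term alone supplies what the vanishing $(\nabla s_0)^2$ term cannot; dropping it would leave $U = W$, which holds only at isolated points.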

This equivalence, as well as the approximate equivalence which can
be obtained in the case of large values of W or m, must not be
misunderstood. It refers to particular solutions of equations (17) and (17a), or of
the corresponding Schrödinger equation

$$\nabla^2\psi + \frac{8\pi^2 m}{h^2}(W - U)\psi = 0 \tag{17b}$$

with

$$\psi = Ae^{i2\pi(s_0 - Wt)/h} = \psi^0(x, y, z)\,e^{-i2\pi Wt/h}, \tag{17c}$$

that is, to solutions which represent, approximately, waves travelling in
a definite direction (the direction may, of course, vary from point to
point, being defined by the direction of the ‘rays’ passing through these
points). Now the general solution of (17b) in the case of short waves
can be represented as a superposition of a number of such particular
solutions corresponding to waves travelling in different directions under
the limitations imposed by boundary conditions (in the case of long
waves this is possible for force-free motion only). The classical equation
(15b), on the other hand, does not admit of such superposition for the
function $\psi$ defined as $Ae^{i2\pi s/h}$. This can clearly be seen in the simple
case of one-dimensional motion where $A$ is connected with $s$ by the
relation $A^2\,\frac{ds}{dx} = C$ [cf. (4), § 1], so that
$\psi = \sqrt{\frac{C}{ds/dx}}\;e^{i2\pi s/h}$. The physical
reason for this is that ‘superposition’ of two different types of motion
would mean, according to classical mechanics, their ‘simultaneous
realization’, an obviously impossible thing if they are alternative. In
wave mechanics, on the contrary, it is just this alternative character
which is expressed by superposition, the latter corresponding to the
addition law of the classical probability theory. Similar results apply
to the general equations (12) and (15), the former allowing the
superposition of processes with different energies if U does not depend upon
the time, while the latter reduces in this case to equation (15b)
corresponding to one definite value of the energy W.

The non-validity of the superposition principle in classical mechanics
can easily be demonstrated with the help of the function
$S = \frac{h}{2\pi i}\log\psi$ introduced in § 2 [eq. (8)]. This function satisfies the differential
equation

$$\frac{h}{4\pi i m}\nabla^2 S + \frac{\partial S}{\partial t} + \frac{1}{2m}(\nabla S)^2 + U = 0 \tag{18}$$

which is obtained from Schrödinger's equation (12) by the substitution
$\psi = e^{i2\pi S/h}$ and which reduces to the Hamilton-Jacobi equation (15) if
$h$ is put equal to zero. The function $S$ thus coincides in this case
with the function $s$, which means that the amplitude $A$ can be
considered as practically constant.
h 


Now if in the Hamilton-Jacobi equation (15) we put $s = S = \frac{h}{2\pi i}\log\psi$,
we get the following ‘approximate’ equation for $\psi$:

$$\frac{h}{2\pi i}\frac{1}{\psi}\frac{\partial\psi}{\partial t} - \frac{h^2}{8\pi^2 m}\frac{(\nabla\psi)^2}{\psi^2} + U = 0$$

or

$$(\nabla\psi)^2 - \frac{8\pi^2 m}{h^2}\,\psi\left(\frac{h}{2\pi i}\frac{\partial}{\partial t} + U\right)\psi = 0, \tag{18a}$$

which is quadratic and of the first order (like the equation for $S$) instead
of being linear and of the second order like the exact equation of
Schrödinger. If $\psi_1$ and $\psi_2$ are two particular solutions of (18a), the
function $\psi = \psi_1 + \psi_2$ will not in general represent a solution of this
equation.
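
The failure of superposition can be checked in the simplest assumed setting: stationary, force-free, one-dimensional motion. With the time factor $e^{-i2\pi Wt/h}$ and $U = 0$, the quadratic first-order equation reduces to $(d\psi/dx)^2 + \alpha^2\psi^2 = 0$ and the linear second-order equation to $d^2\psi/dx^2 + \alpha^2\psi = 0$, where $\alpha^2 = 8\pi^2 mW/h^2$. The value of $\alpha$, the sample point and the function names below are hypothetical.

```python
import cmath

ALPHA = 1.5  # assumed wave number, alpha^2 = 8*pi^2*m*W/h^2

def plane_wave(x, direction):
    """Return (psi, dpsi/dx, d2psi/dx2) for exp(direction * i * ALPHA * x)."""
    psi = cmath.exp(1j * direction * ALPHA * x)
    return psi, 1j * direction * ALPHA * psi, -ALPHA ** 2 * psi

def add(wave_a, wave_b):
    """Superpose two waves: add the functions and their derivatives termwise."""
    return tuple(a + b for a, b in zip(wave_a, wave_b))

def linear_residual(wave):
    """Left side of the linear second-order equation psi'' + alpha^2 psi."""
    psi, _, d2psi = wave
    return d2psi + ALPHA ** 2 * psi

def quadratic_residual(wave):
    """Left side of the quadratic first-order equation (psi')^2 + alpha^2 psi^2."""
    psi, dpsi, _ = wave
    return dpsi ** 2 + ALPHA ** 2 * psi ** 2

X = 0.7
forward = plane_wave(X, +1)
backward = plane_wave(X, -1)
both = add(forward, backward)
```

Each travelling wave satisfies both equations, and their sum still satisfies the linear one, but the quadratic residual of the sum equals $4\alpha^2$ rather than zero.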

Returning to the representation of the exact wave function in the
form $Ae^{i\phi} = Ae^{i2\pi s/h}$, and considering equation (16a) connecting $A$ and
$s$, which has been disregarded hitherto, we see that this equation can
be simplified if multiplied by $A$. We have in fact
$2A\,\frac{\partial A}{\partial t} = \frac{\partial(A^2)}{\partial t}$ and

$$2A\nabla A\cdot\nabla s + A^2\nabla^2 s = \nabla(A^2)\cdot\nabla s + A^2\nabla^2 s = \operatorname{div}(A^2\nabla s),$$

so that

$$m\frac{\partial(A^2)}{\partial t} + \operatorname{div}(A^2\nabla s) = 0. \tag{19}$$

This equation is of the same form as the equation of continuity, i.e. the
equation of the conservation of mass in hydrodynamics or of the
conservation of electricity in electrodynamics,

$$\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{j} = 0,$$

where $\rho$ is the density of mass or electrical charge and $\mathbf{j}$ the
corresponding current density. In the present case we can interpret the
quantity

$$A^2 = \psi\psi^* = \rho$$

as the density of the copy assembly (i.e. the relative number of copies
of the given particle in unit volume) or the density of probability. If,
further, we define the corresponding current density by the formula

$$\mathbf{j} = \frac{1}{m}A^2\nabla s, \tag{19a}$$

then equation (19) will express the law of the conservation of the copies
or of the probability. In the classical theory the vector $\nabla s/m$ reduces
to the actual velocity $\mathbf{v}$ of the particle (or more exactly of its copies
at the given point), so that $\mathbf{j}$ assumes the usual form of the product
of $\rho$ with $\mathbf{v}$. In the exact wave-mechanical theory it can also be written
in the form

$$\mathbf{j} = \rho\,\bar{\mathbf{v}},$$

where the vector

$$\bar{\mathbf{v}} = \frac{1}{m}\nabla s \tag{19b}$$

must obviously be interpreted as the probable velocity. The classical
velocity can be computed as usual by means of the formula

$$v = \sqrt{\frac{2}{m}(W - U)},$$

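its direction being, however, uncertain. According to the definition of
$A$ and $s$, we have $\psi = Ae^{i2\pi s/h}$, $\psi^* = Ae^{-i2\pi s/h}$, whence

$$s = \frac{h}{4\pi i}\log\frac{\psi}{\psi^*}$$

and consequently

$$\mathbf{j} = \frac{h}{4\pi i m}(\psi^*\nabla\psi - \psi\nabla\psi^*) = \frac{h}{2\pi m}\,R\!\left(\frac{1}{i}\psi^*\nabla\psi\right). \tag{20}$$

Introducing the function $S = \frac{h}{2\pi i}\log\psi$ we get $\nabla S = \frac{h}{2\pi i}\frac{\nabla\psi}{\psi}$ and
$\frac{h}{2\pi i}\psi^*\nabla\psi = \psi\psi^*\nabla S$, so that

$$\mathbf{j} = \frac{1}{m}\psi\psi^*\,R(\nabla S) \tag{20a}$$

and

$$\bar{\mathbf{v}} = \frac{1}{m}R(\nabla S). \tag{20b}$$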
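
The conservation law can be tested numerically on a wave function that is not of the one-sided type. The sketch below superposes two force-free plane waves of different energies in one dimension and checks that $\partial\rho/\partial t + \partial j/\partial x$ vanishes, with $\rho = \psi\psi^*$ and $j$ taken in the real-part form of (20). The units ($h/2\pi = 1$, $m = 1$), the wave numbers, the sample point and the function names are assumptions for illustration.

```python
import cmath

# Assumed units for illustration only: h/(2*pi) = 1 and m = 1.
HBAR = 1.0
M = 1.0
K1, K2 = 1.0, 2.5  # assumed wave numbers of the two superposed free waves

def omega(k):
    """Angular frequency of a free wave: W/(h/2pi) with W = (h/2pi)^2 k^2 / (2m)."""
    return HBAR * k * k / (2.0 * M)

def psi(x, t):
    """Superposition of two force-free plane waves with different energies."""
    return (cmath.exp(1j * (K1 * x - omega(K1) * t))
            + cmath.exp(1j * (K2 * x - omega(K2) * t)))

def dpsi_dx(x, t):
    """Analytic x-derivative of psi."""
    return (1j * K1 * cmath.exp(1j * (K1 * x - omega(K1) * t))
            + 1j * K2 * cmath.exp(1j * (K2 * x - omega(K2) * t)))

def density(x, t):
    """rho = psi * psi^*."""
    return abs(psi(x, t)) ** 2

def current(x, t):
    """j = (h/(2*pi*m)) * Re[(1/i) psi^* dpsi/dx]."""
    return (HBAR / M) * (psi(x, t).conjugate() * dpsi_dx(x, t) / 1j).real

def density_rate(x, t, step=1e-4):
    """Central-difference estimate of d(rho)/dt."""
    return (density(x, t + step) - density(x, t - step)) / (2.0 * step)

def current_divergence(x, t, step=1e-4):
    """Central-difference estimate of dj/dx."""
    return (current(x + step, t) - current(x - step, t)) / (2.0 * step)

X0, T0 = 0.3, 0.2
residual = density_rate(X0, T0) + current_divergence(X0, T0)
```

At the sample point the density is changing in time at a rate of roughly 0.4 in these units, yet the sum of the two terms vanishes to within the finite-difference error, as the continuity equation demands.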


Comparing this with (19b), we see that the function $s$ is equal to the
real part of $S$, in accordance with the relation $S = s + \frac{h}{2\pi i}\log A$ which
results from comparing the two expressions $e^{i2\pi S/h}$ and $Ae^{i2\pi s/h}$ for $\psi$.
The probable velocity (20b) could be represented in the form

$$\bar{\mathbf{v}} = |v|\int \mathbf{n}\,p(\mathbf{n})\,d\omega,$$

where $\mathbf{n}$ is the unit vector which defines the direction of the classical
velocity and $p(\mathbf{n})\,d\omega$ is the probability that this unit vector lies in the
infinitely small solid angle $d\omega$. An unambiguous determination of this
probability appears, however, to be impossible, except for
one-dimensional motion considered in the preceding section. This is quite natural
if we remember that the notion of classical velocity, as measured by
the time derivative of the coordinates, cannot be taken over into wave
mechanics.

It should be mentioned in conclusion that the relation between wave 
mechanics and classical mechanics is usually compared with the relation 
between wave optics and the so-called geometrical optics, the latter 
being defined as the limiting case of wave optics for very small wave- 
lengths. This statement would, however, be misleading unless we add 
to it that in geometrical optics partial reflection of light (which actually
decreases with decrease of wave-length) should be wholly left out of 
account—even in its simplest form on the boundary surface between 
two homogeneous media. In this case—and only in this case—is it pos- 
sible to introduce the idea of rays as lines along which the propagation 
of light takes place (this is why geometrical optics is often called ‘ray 
optics’ in contradistinction to wave optics, where the idea of ‘rays’ has 
in general no meaning). It was the merit of Hamilton to show, one 
hundred years ago, that in this limiting case the wave conception of 
light can be replaced by the corpuscular conception, and that the rays 
can be described as the paths of light particles moving, according to 
Newton's classical law, in a certain field of force. The potential energy
of this field of force U is determined by the refractive index $\mu$ according
to the relation

$$\mu^2 = \gamma(W - U),$$

where $\gamma$ is a constant depending upon the definition of the mass of
a light particle.† But perhaps the main merit of Hamilton’s work was
that he applied the same considerations to the motion of particles of 
ordinary matter, thus for the first time associating such motion with the 
propagation of (infinitely short) waves and describing it by equation (15). 
This association of particles with waves, which in Hamilton’s theory 
was achieved by interpreting the ‘mechanical action’ s as a measure 
of the phase function ¢, was, however, completely forgotten for 
a hundred years, until de Broglie rediscovered it in the way described
in Part I, and Schrödinger introduced his wave equation, whose relation
to the Hamilton-Jacobi equation has been discussed above.

† This relation is obtained in the simplest way by comparing de Broglie’s formula
for the wave-length $1/\lambda = \sqrt{2m(W-U)}/h$ with the formula $\lambda_0/\lambda = \mu$, which can be
considered as the definition of the refractive index, $\lambda_0$ being the value of $\lambda$ in vacuo,
i.e. for a place where $\mu = 1$.

This mutual reaction of optics and mechanics must not be misinter- 
preted as an indication of a true analogy between them—in the sense 
of a wave-corpuscular duality of light. We must not be led by it to 
infer the real existence of photons, moving in material bodies according 
to the laws of wave mechanics. For we could replace optics by acoustics, 
i.e. light vibrations by mechanical vibrations propagated in the form 
of waves in elastic media according to an equation of exactly the same 
kind as the differential equation for the light waves. In the limiting 
case of infinitely short acoustical waves we could therefore obtain 
exactly the same results as in optics, i.e. a kind of ‘ray acoustics’ 
instead of a ‘wave acoustics’. This would enable one to formulate a 
corpuscular theory of sound and describe the propagation of sound as
the motion, according to wave mechanics, of certain particles—e.g. 
‘phonons’. I do not think, however, that anybody would believe in the 
reality of such ‘phonons’. This does not mean, of course, that the 
photons are equally unreal, for the analogy between acoustics and optics 
is just as superficial as that between optics and mechanics (or acoustics 
and the mechanics of single particles)—I am inclined, however, to 
think that photons have no more reality than ‘phonons’, and that they 
are created by a ‘reflection’, as it were, of the wave-corpuscular duality 
of matter in the phenomena of light (cf. Part I).


4. Comparison of the Approximate Solutions of Schrödinger’s
Equation; Comparison of Classical and Wave-mechanical 
Average Values 

Although in the case $h = 0$ the functions $s$ and $S$ satisfy the same
equation, namely, that of Hamilton and Jacobi, yet the approximate
expressions for $\psi$ obtained therefrom, according to the formulae
$\psi = Ae^{i2\pi s/h}$ and $\psi = e^{i2\pi S/h}$, turn out to be somewhat different, for the
‘amplitude’ $A$ obtained by means of equation (16a) is in general a
certain function of the coordinates (and the time), varying very slowly
compared with the ‘phase factor’ $2\pi s/h$.

The discrepancy between the two approximate solutions is due to
the fact that the error introduced by putting $h = 0$ is larger in the
case of equation (18), which contains $h$ in the first power, than in
the case of equations (16) and (16a), where $h$ appears in the second
power. In the latter case we thus drop a small term of the second order,
while in the former case we drop a much larger term of the first order.


In order to remove this discrepancy we must put

$$S = S^0 + \frac{h}{2\pi i}S' \tag{21}$$

and after substituting this expression in equation (18) drop terms which
are quadratic in $h$ but keep those which are linear in $h$ ($S^0$ and $S'$ being
independent of $h$ and therefore of the same order of magnitude). We
thus get the equation

$$\frac{h}{4\pi i m}\nabla^2 S^0 + \frac{\partial S^0}{\partial t} + \frac{h}{2\pi i}\frac{\partial S'}{\partial t} + \frac{1}{2m}(\nabla S^0)^2 + \frac{h}{2\pi i m}\nabla S^0\cdot\nabla S' + U = 0. \tag{21a}$$

Here $S^0$ must be regarded as the zero approximation, corresponding
to $h = 0$, i.e. as the solution of the Hamilton-Jacobi equation

$$\frac{\partial S^0}{\partial t} + \frac{1}{2m}(\nabla S^0)^2 + U = 0.$$

It can obviously be identified with the (approximate) function $s$.
The function $S'$ must therefore satisfy the equation

$$\frac{1}{2m}\nabla^2 S^0 + \frac{\partial S'}{\partial t} + \frac{1}{m}\nabla S^0\cdot\nabla S' = 0, \tag{21b}$$

whence it follows that $S'$ is a real quantity. Now according to (21) we
have $\psi = e^{i2\pi S/h} = e^{S'}e^{i2\pi S^0/h}$ so that, since $S^0 = s$, $e^{S'}$ must be equal
to $A$. Substituting in (21b)

$$S' = \log A, \tag{21c}$$


we do indeed get equation (16a). It may seem that by developing the
function $S$ in a series of powers of the parameter $h/(2\pi i)$

$$S = S^0 + \frac{h}{2\pi i}S' + \left(\frac{h}{2\pi i}\right)^2 S'' + \dots$$

and solving the equation (18) by successive approximations, one can
obtain as good an approximation for $S$ as may be desired. This
assumption is, however, incorrect, for it can be shown that the preceding series
is divergent or rather semi-convergent, which explains why one gets
a closer approximation by keeping the first-order term, as has been
done above. In fact the general solution of a differential equation
of the second order cannot be approximated to by starting
with the solution of the equation of the first order obtained by
dropping the second-order terms, however small the parameter by
which they are multiplied may be, just as a quadratic equation cannot
be approximated to by the linear one obtained by dropping the quadratic
term. If, however, the latter is multiplied by a small parameter, then
one of the two solutions of the quadratic equation can be
approximated to by the solution of the linear one. A similar relationship exists
between the function $\psi = e^{i2\pi S^0/h + S'}$ and one of the particular
solutions of Schrödinger’s equation, representing approximately waves
travelling in one direction. It should be mentioned that this direction
need not remain constant; it can be changed by total reflection, which,
in contradistinction to partial reflection, is a phenomenon perfectly
compatible with classical mechanics since it does not involve any
ambiguity and therefore does not challenge a deterministic description
of the motion. The difference between classical mechanics and wave
mechanics in the approximate form given above, in so far as total
reflection is concerned, consists only in the fact that, according to the
latter, the particle can penetrate into those regions of the field of force
where its ‘classical’ velocity becomes imaginary.

According to the relation $\mathbf{v} = \nabla s/m = \nabla S^0/m$, it should follow that
the functions $s$ and $S^0$ must also become imaginary. So far as $S^0$ is
concerned this is perfectly true. The function $s$, however, according to
its definition, must remain real. It will therefore be different from $S^0$
for those regions where v is imaginary and will satisfy an equation 
different from that of Hamilton and Jacobi. We must remember that 
equations (16) and (16a) were obtained on the assumption that both 
s and A were real. The assumption that s satisfies approximately the 
Hamilton-Jacobi equation, even when the latter gives imaginary values 
for it, would thus imply a contradiction. 

This means that, in the case under consideration, $\nabla^2 A$ must be very
large and of the order of magnitude of $1/h^2$, so that the first term in
equation (16) or (17), which when omitted reduces (16) or (17) to the
Hamilton-Jacobi equation, cannot be dropped. We shall not consider
the approximate solution of equations (16) and (16a) [or (17) and (17a)]
for this case. It is simpler to use instead the alternative representation
of $\psi$ by means of the function $S = S^0 + S'h/(2\pi i)$, since we do not have
to worry about the reality of $S^0$. An imaginary value of $S^0$ leads,
according to (21b), to an imaginary value of $S'$. The role of the functions
$S^0$ and $S'$ as determining the phase and the amplitude respectively
will thus be reversed for classically forbidden regions, so that, using
the expression $Ae^{i2\pi s/h}$ for $\psi$, we can put

$$A = e^{i2\pi S^0/h} = e^{\pm 2\pi|S^0|/h} \qquad (22)$$

and

$$s = \frac{h}{2\pi i}S' = \pm\frac{h}{2\pi}|S'|. \qquad (22a)$$


The sign ($+$ or $-$) is determined by the condition that $A$ (i.e. $|\psi|$) must
decrease with increased penetration into the forbidden region. It can
easily be proved directly that the expressions (22) and (22a) constitute
an approximate solution of the equations (16) and (16a) for the case
in question if the functions $S^0$ and $S'$ are determined respectively by
the Hamilton-Jacobi equation and by equation (21b).

Returning to the case when $S^0$ is real (and equal to $s$), corresponding
to the motion in the classically allowed region of the field of force, let
us examine the approximate values which are obtained for the amplitude
$A = e^{S'}$.

We shall first consider the simplest case of a one-dimensional motion
with constant energy. We have in this case, according to (4),

$$A^2\frac{ds_0}{dx} = \text{const.},$$

that is, since $ds_0/dx = g = mv$,

$$A^2 = \frac{C^2}{|v|}, \qquad (23)$$

where $C^2$ denotes a positive number. We thus get approximately

$$\psi = \frac{C}{\sqrt{|v|}}\,e^{i2\pi[s_0(x) - Wt]/h}, \qquad (23a)$$

$s_0(x)$ being a solution of the equation

$$\frac{1}{2m}\left(\frac{ds_0}{dx}\right)^2 + U = W.$$
Formula (23) has a very simple physical meaning. It shows that the
probability of finding the particle within a certain region between $x$ and
$x+dx$ is inversely proportional to its velocity in this region. This is
just what we should expect if this probability were defined as proportional
to the time $dt = dx/v$ which the particle spends in the region
in question. We thus see that the interpretation of the quantity
$\psi\psi^*\,dx = A^2\,dx$ as the relative probability of finding the particle in
the region $dx$ is in agreement, so far as the approximate expression for
$\psi$ is used, with the classical definition of probability in terms of
duration.
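The statement that $A^2\,dx = C^2\,dx/|v|$ measures the time spent in $dx$ can be checked numerically. The following is only a sketch under stated assumptions: a harmonic motion $x = a\sin\omega t$ is assumed for illustration (it is not a case singled out in the text), $C^2$ is fixed so that the total probability over a one-way trip is unity, and the function names are hypothetical.

```python
import math

def time_fraction(a, omega, xa, xb, n=20000):
    """Fraction of a one-way trip (x from -a to +a) spent with xa <= x <= xb."""
    t1, t2 = -math.pi / (2 * omega), math.pi / (2 * omega)
    dt = (t2 - t1) / n
    inside = 0
    for i in range(n):
        t = t1 + (i + 0.5) * dt          # midpoint of the i-th time step
        x = a * math.sin(omega * t)      # assumed harmonic motion
        if xa <= x <= xb:
            inside += 1
    return inside / n

def inverse_velocity_probability(a, omega, xa, xb, n=20000):
    """Integral of C^2/|v| dx over [xa, xb], with C^2 = 1/(duration of the trip)."""
    c_squared = omega / math.pi          # one-way trip lasts pi/omega
    dx = (xb - xa) / n
    total = 0.0
    for i in range(n):
        x = xa + (i + 0.5) * dx
        v = omega * math.sqrt(a * a - x * x)   # |v| from energy conservation
        total += dx / v
    return c_squared * total

p_time = time_fraction(1.0, 2.0, -0.5, 0.5)
p_wave = inverse_velocity_probability(1.0, 2.0, -0.5, 0.5)
print(p_time, p_wave)   # the two numbers should nearly coincide
```

For the interval chosen here both numbers should lie close to 1/3, the first by counting time steps and the second by integrating $1/|v|$.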

If $f(x)$ is some quantity depending upon the position of the particle,
and if the motion of the latter is confined to a limited region of the
$x$-axis, e.g. between $x_1$ and $x_2$, then the average value of this quantity
in the sense of classical mechanics, i.e. with respect to the time, can
be defined by the expression

$$\bar f = \frac{1}{T}\int_0^T f(x)\,dt \qquad (24)$$

taken for a ‘round trip’ of the particle, $T$ representing the duration of
this round trip. The round trip can obviously be replaced by a one-way
trip, since the motion must proceed in the same manner on the two
halves of a round trip, with the sign of the velocity reversed. We can
thus put

$$\bar f = \frac{1}{t_2 - t_1}\int_{t_1}^{t_2} f(x)\,dt,$$

where $t_1$ and $t_2$ denote the time of starting from the point $x_1$ and
arriving at the point $x_2$ respectively. Replacing $dt$ by $dx/v$, where $v$ is
a function of $x$ determined by the equation $v^2 = \frac{2}{m}(W-U)$, we
get

$$\bar f = \frac{1}{t_2 - t_1}\int_{x_1}^{x_2}\frac{f(x)}{v}\,dx \qquad (24a)$$

or, if a ‘round trip’ is taken instead of a ‘one-way’ trip,

$$\bar f = \frac{1}{T}\oint\frac{f(x)}{v}\,dx,$$

the velocity $v$ being taken with the same sign as $dx$ (i.e. $+$ when $x$ is
increasing from $x_1$ to $x_2$, and $-$ when it is decreasing from $x_2$ to $x_1$).

Now the expression (24a) for $\bar f$ is identical with that obtained by
means of the wave-mechanical definition of the average value of $f(x)$
according to the formula

$$\bar f = \int f(x)\,\psi\psi^*\,dx, \qquad (24b)$$

if the function $\psi$ is assumed to vanish outside the region $(x_1, x_2)$
and is replaced by its approximate expression (23a) for this region.
The normalization constant $C$ must be determined by the condition
$\int\psi\psi^*\,dx = 1$, that is,

$$C^2\int_{x_1}^{x_2}\frac{dx}{v} = C^2\int_{t_1}^{t_2}dt = C^2(t_2 - t_1) = 1.$$

This agreement of the classical theory with the wave-mechanical theory
must not be overestimated. As a matter of fact the function $\psi$ does
not in general vanish outside the classically allowed region, but, as we
have just seen, decreases there approximately as $e^{-2\pi|S^0|/h}$. According
to the relation $v = \frac{1}{m}\frac{dS^0}{dx}$, we can put (dropping the term containing
the time)

$$S^0 = m\int v\,dx = \int\sqrt{2m(W-U)}\,dx. \qquad (25)$$

This formula applies just as well, i.e. with the same degree of approximation,
to the points inside and outside the region $(x_1, x_2)$. In the latter
case, for a point $x > x_2$, we can put

$$|S^0(x)| = \int_{x_2}^{x}\sqrt{2m(U-W)}\,dx, \qquad (25a)$$

and consequently

$$|\psi| = C\,e^{-\frac{2\pi}{h}\int_{x_2}^{x}\sqrt{2m(U-W)}\,dx}. \qquad (25b)$$


Thus, to the degree of approximation used, we should define the wave-mechanical
average of $f(x)$ by the equation

$$\bar f = \int_{-\infty}^{+\infty} f(x)\,|\psi|^2\,dx$$

with

$$|\psi|^2 = \frac{C^2}{|v|} = C^2\left[\frac{2}{m}(W-U)\right]^{-1/2}$$

for $W > U$, i.e. for $x_1 < x < x_2$, and

$$|\psi|^2 = C^2\,e^{-\frac{4\pi}{h}\int_{x_2}^{x}\sqrt{2m(U-W)}\,dx}$$

for $x > x_2$, and a similar expression for $x < x_1$. The constant $C$ must
be determined from the equation $\int_{-\infty}^{+\infty}|\psi|^2\,dx = 1$.
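To give an idea of the magnitude of the exponential factor in the forbidden region, the integral in the exponent can be evaluated numerically. The sketch below is illustrative only: the barrier (an electron facing a potential energy 1 eV above its total energy over a depth of 1 ångström) is a hypothetical example not taken from the text, and `attenuation` is an assumed helper name.

```python
import math

H = 6.62607015e-34       # Planck's constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electron-volt in joules

def attenuation(U, W, m, x2, x, n=2000):
    """Ratio |psi(x)|^2 / |psi(x2)|^2 in the forbidden region x > x2.

    Evaluates exp(-(4 pi / h) * integral of sqrt(2 m (U - W)) dx) by the
    midpoint rule; U is a function of position, W the total energy.
    """
    dx = (x - x2) / n
    integral = 0.0
    for i in range(n):
        xi = x2 + (i + 0.5) * dx
        integral += math.sqrt(2.0 * m * (U(xi) - W)) * dx
    return math.exp(-4.0 * math.pi * integral / H)

# Hypothetical example: U - W = 1 eV everywhere, penetration depth 1e-10 m.
W = 0.0
barrier = lambda x: 1.0 * EV
ratio = attenuation(barrier, W, M_E, 0.0, 1.0e-10)

# For a constant barrier the integral is elementary, which gives a check.
closed_form = math.exp(-4.0 * math.pi * math.sqrt(2.0 * M_E * EV) * 1.0e-10 / H)
print(ratio, closed_form)
```

For a barrier of atomic dimensions the factor is by no means negligible, which is why the forbidden regions cannot be ignored in forming the average.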


The difference between the classical and the wave-mechanical aver- 
ages becomes particularly important when there are two or more 
classically allowed regions separated from one another by regions for 
which W < U. The latter, being permeable to the particle from the 
wave-mechanical point of view, do not actually separate but, on the 
contrary, connect the former regions. 

The comparison of the classical ‘time-average’ with the wave- 
mechanical ‘probable value’ for the case of a three-dimensional motion 
is much more complicated than in the one-dimensional case and will be 
considered in the next section in connexion with the wave-mechanical 
interpretation of the quantum conditions. It must be remarked here 
that such averages or probable values have a meaning only when the 
motion is confined to a classically limited region, and that these limits
can be assigned a priori only in the case of a conservative motion,
i.e. a motion with a given (constant) value of the energy $W$. Within
the allowed region, limited by the surface $W - U = 0$, the amplitude
function $A$ must satisfy the equation

$$\operatorname{div}(A^2\nabla s_0) = 0,$$

which can be solved after the function $s_0$ has been determined from
the Hamilton-Jacobi equation (17). It should be remembered that this
equation, which represents another form of equation (17a), expresses
the law of the conservation of the copies of the particle, or of the
probability of its location [cf. (19)].

Although there is in general no exact equivalence between the
classical and the wave-mechanical average values, yet there are special
cases when this equivalence turns out to be exact. An interesting case
of this sort is provided by the so-called ‘virial’, i.e. by the quantity

$$V = \sum\left(x_1\frac{\partial U}{\partial x_1} + \dots\right),$$

which was introduced by Clausius in the kinetic theory of gases.
For a motion restricted to a limited region, the time average of this
quantity $V$ is connected with the time average of the kinetic energy
by the relation

$$2\bar T = \bar V. \qquad (26)$$

This is called the ‘virial theorem’. It can be derived as follows: We
multiply Newton’s equations of motion

$$m\frac{d^2x_1}{dt^2} = -\frac{\partial U}{\partial x_1}, \text{ etc.},$$

by the corresponding coordinates and write

$$x_1\frac{d^2x_1}{dt^2} = \frac{d}{dt}\left(x_1\frac{dx_1}{dt}\right) - \left(\frac{dx_1}{dt}\right)^2.$$

Adding these transformed equations, we get

$$\frac{d}{dt}\sum m\left(x_1\frac{dx_1}{dt} + \dots\right) - \sum m\left[\left(\frac{dx_1}{dt}\right)^2 + \dots\right] = -\sum\left(x_1\frac{\partial U}{\partial x_1} + \dots\right).$$

Formula (26) is then obtained by averaging with respect to the time
and taking account of the fact that the mean value of

$$\frac{d}{dt}\sum m\left(x_1\frac{dx_1}{dt} + \dots\right)$$

vanishes. If we replace the kinetic energy $\bar T$ by the difference $W - \bar U$
and assume that the potential energy is a homogeneous function of the
$n$th degree in the coordinates, formula (26) reduces to the form
$2(W - \bar U) = n\bar U$ or

$$\bar U = \frac{2}{n+2}\,W. \qquad (26a)$$

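It can easily be shown that this relation remains exactly valid in wave
mechanics if $\bar U$ is defined as the integral $\int U\psi\psi^*\,dV$ and $\psi$ is defined
as the exact solution of the corresponding Schrödinger equation. As
an example we shall consider the simplest case of a one-dimensional
wave-mechanical problem which is described by the equation

$$\frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}(W-U)\psi = 0.$$

If we multiply this equation by $x\,d\psi^*/dx$ and the conjugate equation
$\frac{d^2\psi^*}{dx^2} + \frac{8\pi^2 m}{h^2}(W-U)\psi^* = 0$ by $x\,d\psi/dx$ and add, we obtain

$$x\frac{d}{dx}\left(\frac{d\psi}{dx}\frac{d\psi^*}{dx}\right) + \frac{8\pi^2 m}{h^2}\,Wx\frac{d}{dx}(\psi\psi^*) - \frac{8\pi^2 m}{h^2}\,Ux\frac{d}{dx}(\psi\psi^*) = 0.$$

By partial integration with respect to $x$, taking into account the
boundary conditions ($\psi = 0$ and $d\psi/dx = 0$ for $x = \pm\infty$), we get

$$\int_{-\infty}^{+\infty}\frac{d\psi}{dx}\frac{d\psi^*}{dx}\,dx + \frac{8\pi^2 m}{h^2}\,W\int_{-\infty}^{+\infty}\psi\psi^*\,dx - \frac{8\pi^2 m}{h^2}\int_{-\infty}^{+\infty}\frac{d(xU)}{dx}\,\psi\psi^*\,dx = 0,$$

or, since $\int_{-\infty}^{+\infty}\psi\psi^*\,dx = 1$ and $\int_{-\infty}^{+\infty}x\frac{dU}{dx}\,\psi\psi^*\,dx = \bar V$,

$$\int_{-\infty}^{+\infty}\frac{d\psi}{dx}\frac{d\psi^*}{dx}\,dx + \frac{8\pi^2 m}{h^2}\,(W - \bar U - \bar V) = 0.$$

Further, by multiplying the Schrödinger equation by $\psi^*$ and integrating, we obtain

$$\int_{-\infty}^{+\infty}\psi^*\frac{d^2\psi}{dx^2}\,dx + \frac{8\pi^2 m}{h^2}\int_{-\infty}^{+\infty}(W-U)\,\psi\psi^*\,dx = 0,$$

i.e.

$$\int_{-\infty}^{+\infty}\psi^*\frac{d^2\psi}{dx^2}\,dx + \frac{8\pi^2 m}{h^2}\,(W - \bar U) = 0,$$

or, transforming the first term by partial integration,

$$-\int_{-\infty}^{+\infty}\frac{d\psi}{dx}\frac{d\psi^*}{dx}\,dx + \frac{8\pi^2 m}{h^2}\,(W - \bar U) = 0.$$

We have therefore $(W - \bar U) + (W - \bar U - \bar V) = 0$
or $2(W - \bar U) = \bar V$.

This is exactly formula (26) for the special case that we have considered.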
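The wave-mechanical form of the virial theorem can be verified numerically on a known exact solution. The sketch below assumes, purely for illustration, units in which $h/2\pi = m = 1$ and the potential energy $U = x^2/2$ (homogeneous of degree $n = 2$), for which the lowest state is $\psi = \pi^{-1/4}e^{-x^2/2}$ with $W = 1/2$; the helper `integrate` is a hypothetical name.

```python
import math

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

# Assumed units: h/(2 pi) = m = 1; assumed potential U = x^2/2 (degree n = 2).
n_degree = 2
W = 0.5                                        # energy of the lowest state
psi = lambda x: math.pi ** -0.25 * math.exp(-x * x / 2.0)
dpsi = lambda x: -x * psi(x)                   # d(psi)/dx
U = lambda x: 0.5 * x * x
dU = lambda x: x

norm = integrate(lambda x: psi(x) ** 2, -10.0, 10.0)
U_bar = integrate(lambda x: U(x) * psi(x) ** 2, -10.0, 10.0)
V_bar = integrate(lambda x: x * dU(x) * psi(x) ** 2, -10.0, 10.0)   # virial
T_bar = integrate(lambda x: 0.5 * dpsi(x) ** 2, -10.0, 10.0)        # kinetic energy

print(norm, U_bar, V_bar, T_bar)
print(2.0 * T_bar - V_bar)                     # formula (26): should vanish
print(U_bar - 2.0 * W / (n_degree + 2))        # formula (26a): should vanish
```

The kinetic energy is computed here from $\frac{1}{2}\int|d\psi/dx|^2\,dx$, i.e. from the partially integrated form used in the derivation above.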

Another illustration of the connexion between the wave-mechanical
and the classical theory is given by the similarity of the classical
equations of motion,

$$m\frac{d^2x}{dt^2} = -\frac{\partial U}{\partial x}, \text{ etc.},$$

and the wave-mechanical relations

$$m\frac{d^2\bar x}{dt^2} = -\overline{\frac{\partial U}{\partial x}}, \text{ etc.}, \qquad (27)$$

between the corresponding average (or probable) values of the quantities
involved.

The relations (27) were found by P. Ehrenfest. They are usually
referred to, in connexion with the propagation of a wave packet, as the
equations of motion of the ‘centre’ or ‘centroid’ of the latter, that is,
of the point with the coordinates

$$\bar x = \int x\,\psi\psi^*\,dV, \qquad \bar y = \int y\,\psi\psi^*\,dV, \qquad \bar z = \int z\,\psi\psi^*\,dV. \qquad (27a)$$

If the wave function $\psi$ represents a wave packet formed by superposing
waves with slightly different frequencies (i.e. motions with slightly
different energies), the coordinates $\bar x$, $\bar y$, $\bar z$ are certain functions of the
time (in the case of a stationary state where the dependence of $\psi$ upon
the time is specified by the factor $e^{-i2\pi Wt/h}$ they reduce to constants), so
that we can differentiate them with regard to the time. The corresponding
quantities can be defined as the average values of the components
of the velocity of the particle or its acceleration, etc.

We shall prove the relations (27) for the simplest case of a motion
parallel to the $x$-axis (the proof can easily be extended to the case of
three-dimensional motion). We have, by the definition of $\bar x$,

$$\frac{d\bar x}{dt} = \frac{d}{dt}\int x\,\psi\psi^*\,dx = \int x\,\frac{\partial}{\partial t}(\psi\psi^*)\,dx,$$

since $x$ and $t$ are independent variables.

† The proof given is due to B. Finkelstein.


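Now $\psi$ and $\psi^*$ satisfy the equations

$$\frac{\partial\psi}{\partial t} = \frac{ih}{4\pi m}\left(\frac{\partial^2\psi}{\partial x^2} - \mu U\psi\right), \qquad \frac{\partial\psi^*}{\partial t} = -\frac{ih}{4\pi m}\left(\frac{\partial^2\psi^*}{\partial x^2} - \mu U\psi^*\right),$$

where $\mu = 8\pi^2 m/h^2$. Hence

$$\frac{d\bar x}{dt} = \frac{ih}{4\pi m}\int x\left(\psi^*\frac{\partial^2\psi}{\partial x^2} - \psi\frac{\partial^2\psi^*}{\partial x^2}\right)dx.$$

By partial integration, in connexion with the fact that

$$\int_{-\infty}^{+\infty}\frac{\partial f}{\partial x}\,dx = f(+\infty) - f(-\infty)$$

vanishes if the function $f$ contains $\psi$ or $\partial\psi/\partial x$ as a factor (since $\int\psi\psi^*\,dx$
must be finite and equal to 1), we obtain

$$\frac{d\bar x}{dt} = \frac{h}{4\pi mi}\int_{-\infty}^{+\infty}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right)dx. \qquad (27b)$$

This equation could be obtained directly from the relation
$\frac{\partial}{\partial t}(\psi\psi^*) + \frac{\partial j}{\partial x} = 0$ (which is a special case of (19)) and the formula
$j = \frac{h}{4\pi mi}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right)$ for the current density. Putting $j = \psi\psi^*\,\bar v(x)$,
where $\bar v(x)$ is the average velocity at the point $x$, we can rewrite the
preceding equation in the form

$$\frac{d\bar x}{dt} = \int_{-\infty}^{+\infty}\bar v\,\psi\psi^*\,dx,$$

which agrees with the definition of $d\bar x/dt$ as the average value of the
velocity of the particle irrespective of its position.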
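By differentiating (27b) with respect to the time, we obtain

$$\frac{d^2\bar x}{dt^2} = \frac{h}{4\pi mi}\int_{-\infty}^{+\infty}\left[\left(\frac{\partial\psi^*}{\partial t}\frac{\partial\psi}{\partial x} - \frac{\partial\psi}{\partial t}\frac{\partial\psi^*}{\partial x}\right) + \left(\psi^*\frac{\partial^2\psi}{\partial x\,\partial t} - \psi\frac{\partial^2\psi^*}{\partial x\,\partial t}\right)\right]dx,$$

or, transforming the terms with mixed derivatives by partial integration and
inserting the above expressions for $\partial\psi/\partial t$ and $\partial\psi^*/\partial t$,

$$\frac{d^2\bar x}{dt^2} = -\frac{h^2}{8\pi^2 m^2}\int_{-\infty}^{+\infty}\left[\left(\frac{\partial^2\psi^*}{\partial x^2} - \mu U\psi^*\right)\frac{\partial\psi}{\partial x} + \left(\frac{\partial^2\psi}{\partial x^2} - \mu U\psi\right)\frac{\partial\psi^*}{\partial x}\right]dx$$

$$= -\frac{h^2}{8\pi^2 m^2}\int_{-\infty}^{+\infty}\left[\frac{\partial}{\partial x}\left(\frac{\partial\psi}{\partial x}\frac{\partial\psi^*}{\partial x}\right) - \mu U\frac{\partial}{\partial x}(\psi\psi^*)\right]dx$$

$$= -\frac{h^2\mu}{8\pi^2 m^2}\int_{-\infty}^{+\infty}\frac{\partial U}{\partial x}\,\psi\psi^*\,dx,$$

i.e.

$$m\frac{d^2\bar x}{dt^2} = -\overline{\frac{\partial U}{\partial x}},$$

where

$$-\overline{\frac{\partial U}{\partial x}} = -\int_{-\infty}^{+\infty}\frac{\partial U}{\partial x}\,\psi\psi^*\,dx$$

is the average (or probable) value of the force acting on the particle.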
It must be emphasized that this value refers not to the average 
(or probable) position of the particle, determined by the centre of the 
packet (otherwise this centre would move exactly according to the 
classical mechanics), but to all possible positions. 
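The distinction between the average force and the force at the average position is easily exhibited numerically. In the sketch below the probability density $\psi\psi^*$ is assumed, for illustration only, to be a Gaussian of centre `x0` and width `sigma`, and the potential energy is assumed to be $U = x^4/4$; neither choice is taken from the text, and the function names are hypothetical.

```python
import math

def gaussian_density(x, x0, sigma):
    """Assumed form of psi psi* for the packet (normalized Gaussian)."""
    return math.exp(-(x - x0) ** 2 / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def mean(f, x0, sigma, n=40000, width=12.0):
    """Average of f(x) over the packet, midpoint rule on x0 +/- width*sigma."""
    a = x0 - width * sigma
    dx = 2.0 * width * sigma / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        total += f(x) * gaussian_density(x, x0, sigma) * dx
    return total

x0, sigma = 1.0, 0.5

# Anharmonic case, U = x^4/4, dU/dx = x^3: the two quantities differ.
dU_quartic = lambda x: x ** 3
mean_gradient = mean(dU_quartic, x0, sigma)               # average of dU/dx
gradient_at_mean = dU_quartic(mean(lambda x: x, x0, sigma))

# Harmonic case, U = x^2/2, dU/dx = x: the two quantities coincide.
dU_harmonic = lambda x: x
harmonic_difference = mean(dU_harmonic, x0, sigma) - dU_harmonic(mean(lambda x: x, x0, sigma))

print(mean_gradient, gradient_at_mean, harmonic_difference)
```

For the Gaussian assumed here the average of $x^3$ is $x_0^3 + 3x_0\sigma^2$, so that the discrepancy disappears only in the limit of a vanishingly narrow packet or for a force linear in $x$.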

If the dimensions of the packet are very small (which means that 
the uncertainty in the estimation of the particle’s velocity is very large) 
the motion of its centre closely follows classical motion. This, however, 
persists only for a very short time, for the packet will spread, the rate 
of this spreading being the larger the smaller its original dimensions
(i.e. the larger the original uncertainty in the velocity).


5. Motion in a Limited Region; Quantum Conditions and Average Values


We shall now investigate the case of a (three-dimensional) motion 
restricted classically to a finite region of space (where W—U > 0), and 
derive the ‘quantization rules’ characteristic of such a motion with the 
help of the approximate wave-mechanical theory based on the classical 
determination of the phase or action function $s$ ($= S^0$) by means of the
Hamilton-Jacobi equation. A motion of this kind must obviously have 
a periodic or quasi-periodic character, so that the path described by 
the particle may fill up the whole region or pass many times in various 
directions through the same or nearly the same point (as, for instance, 
in the simple case of the oscillatory motion of a particle along a straight 
line). If the particle is replaced by a continuous assembly of its copies,
a rather complicated picture results, different copies passing simultaneously
through the same point with velocities which are in general
different both in regard to direction and (if the field of force varies
with the time) in regard to magnitude. The latter must, of course,
remain a single-valued function of the coordinates in the case of motion
with a given (constant) value of the total energy $W$. The function
$\phi = s/m$, which can be defined as the velocity potential, must, however,
in this case (as well as in the general case of non-conservative motion)
be a multiple-valued function of the coordinates. Considering the copy
assembly as a kind of fluid, we can illustrate the case in question by 
the familiar type of fluid motion with closed stream-lines, each stream- 
line representing the path of all the particles situated on it. In the 
associated wave picture these closed paths of the separate particles or 
copies must be interpreted as closed rays. 

Now a fluid motion of this type can be irrotational if, for instance,
the fluid is flowing in a closed tube or around some closed tube. The
velocity $\mathbf v$ of the particles, as a function of their coordinates, can then
be represented as the gradient of a potential $\phi$, provided the latter is
defined as a multiple-valued function of the coordinates. In fact, taking
the integral of the velocity along a line $\sigma$ connecting two points $P_1$ and
$P_2$, then, since the projection $v_\sigma$ of $\mathbf v$ on the line element $d\sigma$ is, by
definition, equal to $d\phi/d\sigma$, we get

$$\int_{P_1}^{P_2} v_\sigma\,d\sigma = \phi(P_2) - \phi(P_1).$$

If the line is closed, i.e. if the points $P_1$ and $P_2$ coincide, this integral
should be equal to zero, irrespective of the shape of the line, unless we
assume that for closed lines of certain type the potential $\phi$ may change
after a ‘round trip’ by an amount $\Delta\phi$ equal to the value of the integral
$\oint v_\sigma\,d\sigma$ taken along the corresponding closed line. If the latter coincides
with a stream-line, the integral will certainly be different from zero,
since along this line we must have $v_\sigma = |\mathbf v|$.

Now it can easily be proved that in the case of irrotational motion
the integral $\oint v_\sigma\,d\sigma$, which is called the ‘circulation’, will have the same
value for all closed lines of the same family, i.e. of the same general type.
In the case of a fluid flowing around a closed tube along closed stream-lines
(Fig. 1), we must distinguish closed lines of two families: those
which do not surround the tube, and those which do. For the former
the circulation will be equal to zero, while for the latter it will have
a certain value different from zero. This result follows from the transformation
of the line integral $\oint v_\sigma\,d\sigma$, by means of Stokes’s formula,
into the integral $\int(\operatorname{curl}\mathbf v)_n\,dS$ over any surface $S$ limited by the line $\sigma$.
In the case of the lines of the first family the surface $S$ will be situated
entirely within the fluid, so that the integral will vanish, since the
motion is supposed to be irrotational ($\operatorname{curl}\mathbf v = 0$). In the case of the
lines of the second family the surface $S$ will cut the tube around which
the fluid is flowing. Since for points inside the tube the idea of velocity
has no meaning, we can replace the surface $S$ by another surface $S'$
bounded by two closed lines of the second family. Stokes’s formula
applied to this surface which lies wholly within the fluid, and for which
therefore the integral $\int(\operatorname{curl}\mathbf v)_n\,dS$ vanishes, leads to the result that
the integral $\oint v_\sigma\,d\sigma$ taken over the double boundary of $S'$ must vanish
if the ‘round trip’ is made in opposite directions along the two constituent
lines, whence it follows that the circulation will have the same
value for both lines if the round trip is made in the same direction.

Fig. 1
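The two families of closed lines can be illustrated by a numerical evaluation of the circulation. The sketch assumes, for illustration only, the plane irrotational flow $\mathbf v = \frac{\Gamma}{2\pi}\,(-y, x)/(x^2 + y^2)$ around an infinitely thin tube along the $z$-axis; the constant `GAMMA` and the function names are hypothetical.

```python
import math

GAMMA = 2.5   # hypothetical value of the circulation around the tube

def velocity(x, y):
    """Assumed irrotational flow around a thin tube at the origin."""
    r2 = x * x + y * y
    return (-GAMMA * y / (2.0 * math.pi * r2), GAMMA * x / (2.0 * math.pi * r2))

def circulation(cx, cy, radius, n=4000):
    """Line integral of the tangential velocity round a circle (midpoint rule)."""
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        vx, vy = velocity(x, y)
        # component of v along the line element, times the length of the element
        total += (-vx * math.sin(theta) + vy * math.cos(theta)) * radius * dtheta
    return total

around_tube = circulation(0.3, 0.2, 1.0)     # circle surrounding the origin
beside_tube = circulation(3.0, 0.0, 1.0)     # circle not surrounding it
larger_loop = circulation(-0.5, 0.1, 2.0)    # another circle of the second family
print(around_tube, beside_tube, larger_loop)
```

The circles of the second family give the same value whatever their centre and radius, while the circle lying wholly beside the tube gives zero, in accordance with the argument from Stokes’s formula.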

It may be mentioned that exactly similar results are met with in the
theory of the magnetic field generated by a linear electric current. This
field (outside the wire along which the current is flowing) is also
irrotational, so that the magnetic field strength can be defined as the
gradient of a certain magnetic potential. With every trip around the
wire along any closed line (encircling this wire only once) this potential
must change by a definite value, namely $4\pi i$, where $i$ is the strength
of the current.

The preceding results can be applied without substantial modification 
to the flow of the fictitious fluid represented by the copy assembly of 
a particle moving in a limited region. In the copy assembly, however,
we must remember that different copies may be imagined to pass
simultaneously through the same point in different directions. This is,
of course, impossible in the case of real particles. In particular, closed
stream-lines may degenerate into ‘double lines’, i.e. unclosed lines along
which the copies move first in one and then in the opposite direction
(oscillatory motion).† The ‘circulation’ $\oint v_\sigma\,d\sigma$ for such a double line
will not be equal to zero, but, on the contrary, will be equal to double
the value of the integral $\int v_\sigma\,d\sigma$ for a one-way trip. As a result the
velocity potential $\phi = s/m$, in addition to the multiplicity considered
above, may acquire a duplicity of an entirely different character, corresponding
to the possible presence at each point of two copies moving
in opposite or, in general, in different directions.

Leaving aside this duplicity we see that, in the case of a particle 
confined to a finite region of space, the function s representing the 
mechanical action or the momentum-potential of the copies of this 
particle must—so long as the motion of these copies is supposed to be 
irrotational—be a multiple-valued function of the coordinates, i.e. it 
must change by a certain amount $\Delta s$ for all closed lines (including
double lines) of a certain family. It should be mentioned that ‘round
trips’ along any of these lines have nothing to do with the actual
motion, being performed not by definite copies (the latter need not
move in closed lines), but by the process of linear integration referring
to a definite instant of time. The change $\Delta s$ of the function $s$ for any
such round trip is called a ‘periodicity modulus’ of $s$. From the point
of view of the wave picture associated with the motion of the copy
assembly of the particle these ‘periodicity moduli’ divided by the constant
$h$ represent the number of wave-lengths contained in the corresponding
closed lines. In fact $ds/d\sigma = g_\sigma$ is the component of the
momentum of the particle along the line-element $d\sigma$ and according to
de Broglie’s relation $d(s/h)/d\sigma = g_\sigma/h = k_\sigma$ must be equal to the corresponding
component of the ‘wave-number vector’ $\mathbf k = \mathbf g/h$ of the
associated waves. The integral $\oint k_\sigma\,d\sigma = \Delta s/h$ may therefore be defined
as the number of wave-lengths contained in the line $\sigma$, or, more exactly,
as the number of wave-crests cut by this line, or still more exactly, as
the difference between the number of waves cut by $\sigma$ in the positive
and in the negative direction (i.e. in the direction of propagation and in
the opposite direction).

Now it is clear that in the case of motion corresponding to a definite
energy, the wave system associated with it must be such that the
number of waves cut by any closed line should be integral, corresponding
to a change of the phase $\phi = 2\pi s/h$ by an integral multiple of $2\pi$,
a change which is irrelevant for the value of the wave function $\psi = Ae^{i\phi}$.
In the contrary case the latter would also be a multiple-valued function
of the coordinates, and would not represent a stationary system of
standing waves (each standing wave being produced by the superposition
of waves travelling in different directions), determined by the
condition that the wave function $\psi$ should vanish at or near the
boundary of the region where the particle is supposed to move.

† The tube around which the fluid is supposed to flow degenerating into a ribbon
with zero thickness.

It thus follows from the condition of single-valuedness for the wave 
function that the ‘periodicity moduli’ of the ‘action function’ s must 
be integral multiples of h. 

This condition, which, it should be remembered, refers to the case
of motion confined to a (classically) limited region, can easily be shown 
to be equivalent to the quantum conditions of the old quantum theory 
discovered by Bohr and by Sommerfeld. 
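As an illustration of the condition that the periodicity modulus of $s$ should be an integral multiple of $h$, the closed-line integral of the momentum can be evaluated numerically for an oscillatory motion along a ‘double line’ and the energy adjusted until the condition is met. The sketch assumes, for illustration only, a potential energy $U = kx^2/2$ and units in which $h = m = k = 1$; the function names are hypothetical. The result can be compared with the familiar value $W = nh\nu$ of the old quantum theory for the linear oscillator.

```python
import math

H_PLANCK = 1.0   # assumed units: h = m = k = 1
MASS = 1.0
K = 1.0

def action_integral(W, n=4000):
    """Integral of the momentum g dx round the 'double line' for U = K x^2 / 2."""
    a = math.sqrt(2.0 * W / K)            # classical limits of the motion: -a, +a
    dx = 2.0 * a / n
    one_way = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * dx
        one_way += math.sqrt(2.0 * MASS * (W - 0.5 * K * x * x)) * dx
    return 2.0 * one_way                  # one-way trip counted twice

def quantized_energy(n_quantum, lo=1e-9, hi=100.0):
    """Energy for which the periodicity modulus equals n_quantum * h (bisection)."""
    target = n_quantum * H_PLANCK
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if action_integral(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

nu = math.sqrt(K / MASS) / (2.0 * math.pi)     # classical frequency of vibration
energies = [quantized_energy(n) for n in (1, 2, 3)]
print(energies, [n * H_PLANCK * nu for n in (1, 2, 3)])
```

The bisection is used only because it requires no knowledge of the closed form of the integral; for other fields of force the function `action_integral` alone would have to be changed.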

For the general formulation of these quantum conditions, it is
necessary, instead of the original rectangular coordinates $x$, $y$, $z$, to
introduce new variables (generalized coordinates) $q_1$, $q_2$, $q_3$. If we succeed
in so choosing these new variables that $s$ assumes the form

$$s = \sum_{\alpha=1}^{3} s_\alpha(q_\alpha) \qquad (28)$$

(‘separation variables’), then the quantum conditions run as follows:

$$\oint p_\alpha\,dq_\alpha = \oint\frac{ds_\alpha}{dq_\alpha}\,dq_\alpha = (\Delta s)_\alpha = n_\alpha h \quad (n_\alpha\ \text{an integer}). \qquad (28a)$$


Here the various $p_\alpha$ ($= ds_\alpha/dq_\alpha$) are the ‘generalized momenta’ and
$(\Delta s)_\alpha$ are the ‘principal moduli of periodicity’ of the function $s$, i.e. those
alterations of this function which correspond to a ‘cyclic’ change of
one of the separation coordinates when the remaining two are kept
fixed. By a ‘cyclic’ change of the coordinate $q_\alpha$ we mean an alteration
such that the given particle returns to its original position and
therefore the rectangular coordinates assume their original values. If
the coordinate $q_\alpha$ has the character of an angle so that the rectangular
coordinates are periodic functions of it, then the ‘cyclic change’ of $q_\alpha$
is simply the increase by the corresponding period $\Delta q_\alpha$ (for example,
$2\pi$). Otherwise it is an oscillation of $q_\alpha$ within certain limits determined
by the nature of the field of force. The cyclic alterations of the individual
separation coordinates in the actual motion of the system
take place in periods of time $\Delta t_\alpha$ which are in general different from
one another, so that the motion with regard to the time appears to be
non-periodic or conditionally periodic. This dependence of the variables
$q_\alpha$ on the time plays no part in the ‘quantizing’ defined by formula (28a).

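The generalized momenta appearing in (28a) can be defined, and
indeed are usually defined, in a different way, namely, as the partial
derivatives of the kinetic energy $T$, expressed as a function of the
generalized coordinates and of the corresponding ‘velocities’ $dq_\alpha/dt = \dot q_\alpha$,
with respect to the latter. The equivalence of both definitions is obvious
in the case of rectangular coordinates, since $T = \frac{1}{2}m(v_x^2 + v_y^2 + v_z^2)$ and
$g_x = \partial s/\partial x = \partial T/\partial v_x$, etc. If the coordinates are replaced by new
(generalized) coordinates $q_\alpha(x, y, z)$, we have

$$\dot q_\alpha = \frac{\partial q_\alpha}{\partial x}v_x + \frac{\partial q_\alpha}{\partial y}v_y + \frac{\partial q_\alpha}{\partial z}v_z,$$

whence $\partial\dot q_\alpha/\partial v_x = \partial q_\alpha/\partial x$, etc. We thus get

$$\frac{\partial s}{\partial x} = \sum_{\alpha=1}^{3}\frac{\partial s}{\partial q_\alpha}\frac{\partial q_\alpha}{\partial x}, \qquad \frac{\partial T}{\partial v_x} = \sum_{\alpha=1}^{3}\frac{\partial T}{\partial\dot q_\alpha}\frac{\partial\dot q_\alpha}{\partial v_x} = \sum_{\alpha=1}^{3}\frac{\partial T}{\partial\dot q_\alpha}\frac{\partial q_\alpha}{\partial x},$$

and consequently,

$$\frac{\partial s}{\partial q_\alpha} = \frac{\partial T}{\partial\dot q_\alpha} = p_\alpha.$$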


The formulation of the quantum conditions in the form (28a) is sometimes
possible in two or more different ways, if there exist several
sets of ‘separable’ coordinates. Theoretically it is possible, in a single
way at least, for any type of motion (restricted to a finite region).
Practically, however, the ‘separation coordinates’ can be found only
for simple types of motion (i.e. of the field of force). If the separation
coordinates cannot be found, then the quantum conditions, in the sense
of Bohr’s theory, must be stated in the more general form indicated
above, namely, that the moduli of periodicity of $s$ with respect to any
closed curve should be equal to an integral multiple of $h$ (or to zero).

We shall now turn to the question of the relation between the wave- 
mechanical average or probable value of any function of the coordinates 
of the particle for a given quantized state of motion and the corre- 
sponding classical ‘time average’ of this function. The solution of this 
question depends upon the introduction of new coordinates of a still 
more general kind than those considered above in connexion with the 
formulation of the quantum conditions. These still more general co- 
ordinates are not directly expressible in terms of the original ones, but 
in terms of the original coordinates and the corresponding momenta, 
the new momenta being also functions of the old momenta and of the 
old coordinates. 

Coordinate or rather coordinate-momenta transformations of this 


type were introduced by Hamilton and are called contact or canonical 
transformations (the transformation considered above being a particular 
case of these transformations). 

The theory of canonical transformations is based upon the preservation
of the so-called ‘canonical form’ of the classical equations of
motion. In the case of rectangular coordinates these canonical equations
can be obtained directly from the usual equations of motion
$m\,d^2x/dt^2 = -\partial U/\partial x$, etc., and have the form

$$\frac{dg_x}{dt} = -\frac{\partial H}{\partial x}, \qquad \frac{dx}{dt} = \frac{\partial H}{\partial g_x}, \text{ etc.}, \qquad (29)$$

where

$$H = \frac{1}{2m}(g_x^2 + g_y^2 + g_z^2) + U \qquad (29a)$$

is the total energy expressed as a function of the coordinates and
momenta, and is usually denoted as the ‘Hamiltonian function’. The
equations (29) can be interpreted as referring to a particle moving not
in ordinary space with the three coordinates $x$, $y$, $z$ but in the six-dimensional
phase-space (Part I, Chap. V) with the ‘coordinates’ $x$, $y$, $z$,
$g_x$, $g_y$, $g_z$, the time derivatives of these coordinates representing the six
components of the ‘velocity’ in phase-space and $H$ being a function of
the ‘position’ of the particle in the phase-space.†

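For the sake of uniformity in notation we shall, in the following,
instead of $x$, $y$, $z$ write $Q_1$, $Q_2$, $Q_3$, and instead of $g_x$, $g_y$, $g_z$ write $P_1$, $P_2$, $P_3$.
The equations (29) then become

$$\frac{dP_\alpha}{dt} = -\frac{\partial H}{\partial Q_\alpha}, \qquad \frac{dQ_\alpha}{dt} = \frac{\partial H}{\partial P_\alpha}. \qquad (29b)$$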

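We now introduce new coordinates $Q'_1$, $Q'_2$, $Q'_3$ determined by three
equations of the form

$$Q'_\beta = Q'_\beta(Q_1, Q_2, Q_3) \quad\text{or}\quad Q_\alpha = Q_\alpha(Q'_1, Q'_2, Q'_3) \quad (\alpha, \beta = 1, 2, 3). \qquad (30)$$

We then define the new momenta $P'_1$, $P'_2$, $P'_3$ by the formulae

$$P'_\beta = \frac{\partial s}{\partial Q'_\beta} = \sum_\alpha\frac{\partial s}{\partial Q_\alpha}\frac{\partial Q_\alpha}{\partial Q'_\beta} = \sum_\alpha P_\alpha\frac{\partial Q_\alpha}{\partial Q'_\beta} \quad\text{or}\quad P_\alpha = \sum_\beta P'_\beta\frac{\partial Q'_\beta}{\partial Q_\alpha}, \qquad (30a)$$

which obviously do not assume a knowledge of the function $s$.
It can then easily be shown that these new coordinates and momenta
satisfy a system of equations of the same form as (29b),

$$\frac{dP'_\beta}{dt} = -\frac{\partial H'}{\partial Q'_\beta}, \qquad \frac{dQ'_\beta}{dt} = \frac{\partial H'}{\partial P'_\beta} \quad (\beta = 1, 2, 3), \qquad (31)$$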
† Instead of one particle one can consider a continuous assembly of its copies,
distributed not in the ordinary space as before, but in the phase-space with a density
depending in general upon the time.


where $H'$ is the new Hamiltonian function which is obtained by replacing
in the original function $H(Q, P)$ the old coordinates and
momenta by the new, according to the formulae (30) and (30a). The
transformation defined by these formulae is called a ‘point transformation’.
As already mentioned, it is a special case of the canonical
transformations. A canonical transformation (of the coordinates and
momenta) is defined by the formulae

$$P_\alpha = \frac{\partial\Phi}{\partial Q_\alpha}, \qquad Q'_\beta = \frac{\partial\Phi}{\partial P'_\beta}, \qquad (31a)$$

where $\Phi(Q, P')$ is a completely arbitrary function of the original coordinates
and the new momenta. If, in particular, we put

$$\Phi = \sum_\beta P'_\beta f_\beta(Q_1, Q_2, Q_3)$$

we obtain, by (31a),

$$Q'_\beta = f_\beta(Q_1, Q_2, Q_3), \qquad P_\alpha = \sum_\beta P'_\beta\frac{\partial f_\beta}{\partial Q_\alpha},$$

which corresponds to the point transformation (30), (30a).
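A simple instance of a point transformation is the passage from rectangular to polar coordinates in a plane. The sketch below applies the rule (30a) for the new momenta to this case and verifies that the Hamiltonian function has the same value in both sets of variables. The restriction to two dimensions, the central field $U(r) = -1/r$, the numerical values and the function names are assumptions made for illustration only.

```python
import math

def to_polar(x, y, px, py):
    """New coordinates (r, theta) and new momenta obtained from formula (30a)."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    # P'_beta = sum over alpha of P_alpha * dQ_alpha/dQ'_beta,
    # with x = r cos(theta), y = r sin(theta)
    p_r = px * math.cos(theta) + py * math.sin(theta)
    p_theta = px * (-r * math.sin(theta)) + py * (r * math.cos(theta))
    return r, theta, p_r, p_theta

def hamiltonian_rectangular(x, y, px, py, m, U):
    return (px * px + py * py) / (2.0 * m) + U(math.hypot(x, y))

def hamiltonian_polar(r, theta, p_r, p_theta, m, U):
    return (p_r ** 2 + (p_theta / r) ** 2) / (2.0 * m) + U(r)

U_central = lambda r: -1.0 / r            # hypothetical central field
x, y, px, py, m = 0.7, -1.3, 0.4, 0.9, 2.0

r, theta, p_r, p_theta = to_polar(x, y, px, py)
H_old = hamiltonian_rectangular(x, y, px, py, m, U_central)
H_new = hamiltonian_polar(r, theta, p_r, p_theta, m, U_central)
print(H_old, H_new, p_theta, x * py - y * px)
```

The momentum conjugate to the angle turns out to be the angular momentum $xP_y - yP_x$, as is to be expected.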

The fact that the original canonical equations (29) are transformed 
by (31a) into equations of the same canonical form (31) can be shown 
as follows: 

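We form the complete differential or rather the variation of the
function $\Phi$, corresponding to a virtual variation (completely independent
of the actual motion) of the variables $Q$, $P'$:

$$\delta\Phi = \sum_\alpha\frac{\partial\Phi}{\partial Q_\alpha}\,\delta Q_\alpha + \sum_\beta\frac{\partial\Phi}{\partial P'_\beta}\,\delta P'_\beta = \sum_\alpha P_\alpha\,\delta Q_\alpha + \sum_\beta Q'_\beta\,\delta P'_\beta,$$

and differentiate this expression with regard to the time. We also take
the time derivative of $\Phi$

$$\frac{d\Phi}{dt} = \sum_\alpha P_\alpha\frac{dQ_\alpha}{dt} + \sum_\beta Q'_\beta\frac{dP'_\beta}{dt}$$

and form its variation. By subtracting the expressions thus obtained,
we get, remembering that $\delta$ and $d/dt$ are commutative,

$$\sum_\alpha\left(\frac{dP_\alpha}{dt}\,\delta Q_\alpha - \frac{dQ_\alpha}{dt}\,\delta P_\alpha\right) = \sum_\beta\left(\frac{dP'_\beta}{dt}\,\delta Q'_\beta - \frac{dQ'_\beta}{dt}\,\delta P'_\beta\right).$$

Now by (29b) we have

$$\sum_\alpha\left(\frac{dP_\alpha}{dt}\,\delta Q_\alpha - \frac{dQ_\alpha}{dt}\,\delta P_\alpha\right) = -\sum_\alpha\left(\frac{\partial H}{\partial Q_\alpha}\,\delta Q_\alpha + \frac{\partial H}{\partial P_\alpha}\,\delta P_\alpha\right) = -\delta H.$$

Hence, in virtue of $H(P, Q) = H'(P', Q')$,
we obtain

$$-\delta H' = -\sum_\beta\left(\frac{\partial H'}{\partial Q'_\beta}\,\delta Q'_\beta + \frac{\partial H'}{\partial P'_\beta}\,\delta P'_\beta\right) = \sum_\beta\left(\frac{dP'_\beta}{dt}\,\delta Q'_\beta - \frac{dQ'_\beta}{dt}\,\delta P'_\beta\right).$$

Since the variations $\delta Q'_\beta$ and $\delta P'_\beta$ are arbitrary, we can equate their
coefficients. In this way we get equations (31).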

Those canonical transformations, in which the transformed Hamiltonian
$H'$ depends only on the momenta $P'$ and not on the coordinates
$Q'$, play a special role. Such coordinates are usually called cyclic. The
equations (31) reduce in this case to

$$\frac{dP'_\beta}{dt} = 0, \qquad \frac{dQ'_\beta}{dt} = \frac{\partial H'}{\partial P'_\beta} = \omega_\beta = \text{const.},$$

i.e. $P'_\beta = \text{const.}$, $Q'_\beta = \omega_\beta t + \delta_\beta$.

If the transformation function $\Phi$ leading to cyclic coordinates is
known, the mechanical problem can be regarded as solved, for the
original coordinates and momenta are then expressed according to the
equations (31a) as functions of the time which, besides $t$, only contain
constants $P'_\beta$, $\omega_\beta$, and $\delta_\beta$.

Now it follows from (31a) that this special transformation function
is just the action function $s$ regarded as a function of $Q_1$, $Q_2$, $Q_3$ and of
three arbitrary constants $P'_1$, $P'_2$, $P'_3$ which necessarily appear on solving
the Hamilton-Jacobi equation (16) or (17) by which this function is
defined. These constants of integration can be expressed in terms
of the three principal moduli of periodicity of the action function
$J_\alpha = (\Delta s)_\alpha$ with regard to a system of separable coordinates $q_1$, $q_2$, $q_3$ (which
we need neither actually know nor consider in detail here). Replacing
the original constants $P'_\alpha$ by their expressions in terms of $J_1$, $J_2$, $J_3$ we
can write the transformation function $\Phi$ in the form $s(x, y, z; J_1, J_2, J_3)$
and define the constants $J_\alpha$ as the new momenta ($P'_\alpha = J_\alpha$). Considered
from this point of view these constants are called the ‘action variables’
of the problem. The corresponding cyclic coordinates are called the
‘angle variables’. We shall denote them by $w_\beta$ ($= Q'_\beta$).


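We have therefore

$$w_\beta = \omega_\beta t + \delta_\beta, \qquad (32)$$

where according to the transformed canonical equations (31)

$$P'_\beta = \text{const.}, \qquad \omega_\beta = \frac{\partial H'}{\partial J_\beta} = \text{const.} \quad (H' = W). \qquad (32a)$$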


To ascertain the dependence of the old coordinates $Q_\alpha$ on the new
coordinates $w_\beta$, we shall introduce for a moment as an intermediate
link between them the separation coordinates $q_1$, $q_2$, $q_3$. Expressed as
a function of the latter, the function $s$ assumes the form

$$s = \sum_{\alpha=1}^{3} s_\alpha(q_\alpha; J_1, J_2, J_3). \qquad (32b)$$

To a cyclic alteration of the coordinate $q_\alpha$ there corresponds by (32b)
an alteration of the coordinate $w_\beta$ by $\Delta_\alpha w_\beta = \Delta_\alpha\,\partial s_\alpha/\partial J_\beta$. We have
therefore, because $\Delta_\alpha s_\beta = J_\alpha$ if $\alpha = \beta$, and $= 0$ if $\alpha \ne \beta$,

$$\Delta_\alpha w_\beta = 1 \quad (\alpha = \beta), \qquad \Delta_\alpha w_\beta = 0 \quad (\alpha \ne \beta).$$

These formulae show that when any angle variable $w_\beta$ is increased by
1 and the remaining $w$'s are maintained constant, which corresponds
to the cyclic alteration of the separation coordinate $q_\beta$, i.e. to the
return of the particle to the original position along a ‘$\beta$-curve’, then
the action function $s$ increases exactly by $J_\beta$.

From this it follows that the coordinates $Q_\beta$, and consequently the
momenta $P_\beta$, are periodic functions of the angle coordinates with periods
equal to 1. Each of them, as well as any function $f(Q_1, Q_2, Q_3)$ (or still
more generally $f(Q, P)$), can be expressed in the form of a triple Fourier
series
$$f = \sum_{k_1, k_2, k_3} f_{k_1 k_2 k_3}\, e^{i2\pi(k_1 w_1 + k_2 w_2 + k_3 w_3)}, \tag{33}$$
where $k_1, k_2, k_3$ are integers which can assume all values from $-\infty$ to
$+\infty$, and $f_{k_1 k_2 k_3}$ are certain expansion coefficients characteristic of
the function $f$. If instead of the $w_\beta$ we put their values obtained from
(32), we get
$$f = \sum_{k_1, k_2, k_3} C_{k_1 k_2 k_3}\, e^{i2\pi(k_1 \omega_1 + k_2 \omega_2 + k_3 \omega_3)t}, \tag{33a}$$
where the $C_{k_1 k_2 k_3}$ are new expansion coefficients which we can regard as
the amplitudes of various harmonic vibrations, while
$$\omega = k_1\omega_1 + k_2\omega_2 + k_3\omega_3 \tag{33b}$$
are the frequencies of these vibrations. The quantities $\omega_\beta$, i.e. the
velocities corresponding to the angle coordinates, represent therefore
the fundamental frequencies of the motion.

We can now return to the problem of determining the time mean
value of $f$. This problem can be solved at once by means of formula
(33a). Indeed, the required time mean value must obviously be equal
to that amplitude coefficient in (33 a) for which the vibration frequency
$\omega$ vanishes, or the sum of such coefficients if the equation $\omega = 0$ is
satisfied by several different combinations of the numbers $k_1, k_2, k_3$.


This mean value can be represented on the one hand by the general
formula
$$\bar f = \lim_{T\to\infty} \frac{1}{T}\int_0^T f\,dt.$$
On the other hand it can be represented just as well by the formula
$$\bar f = \int_0^1\!\!\int_0^1\!\!\int_0^1 f\,dw_1\,dw_2\,dw_3, \tag{34}$$
which does not contain the time explicitly, the triple integration being
extended over the 'period cube' in the coordinate space of the angle
variables; $f$ is given as a function of the angle variables by formula (33).

The expression (34) has the form of a 'statistical' mean value corre-
sponding to an averaging over the various copies of the given particle
distributed with a constant density in the space of the angle coordinates
$w_1, w_2, w_3$. Its numerical agreement with the time mean value of $f$ for
a definite copy means that the curve described by the motion of such a
copy fills up this space uniformly.†
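The agreement between the time mean and the average (34) over the period cube can be checked numerically. The following is a minimal sketch and not part of the original argument: the sample function `f`, the frequencies, and the integration parameters are illustrative assumptions. Its constant Fourier coefficient is 1·2·3 = 6. For incommensurable frequencies the time mean approaches this value, while for the commensurable choice (1, 1, 1) further combinations $k_1, k_2, k_3$ give $\omega = 0$ and the time mean becomes 9.

```python
import math

def f(w1, w2, w3):
    """Illustrative function of the angle variables, periodic with period 1 in each."""
    return ((1.0 + math.cos(2.0 * math.pi * w1))
            * (2.0 + math.cos(2.0 * math.pi * w2))
            * (3.0 + math.cos(2.0 * math.pi * w3)))

def time_mean(omega, total_time=500.0, dt=0.01):
    """Midpoint-rule estimate of (1/T) * integral of f dt along w_b = omega_b * t."""
    steps = int(round(total_time / dt))
    acc = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt
        acc += f(omega[0] * t, omega[1] * t, omega[2] * t)
    return acc / steps

def cube_mean(points=8):
    """Midpoint-rule average of f over the unit period cube, as in formula (34)."""
    grid = [(n + 0.5) / points for n in range(points)]
    acc = sum(f(a, b, c) for a in grid for b in grid for c in grid)
    return acc / points ** 3

if __name__ == "__main__":
    print("period-cube mean:          ", cube_mean())
    print("time mean, incommensurable:", time_mean((1.0, math.sqrt(2.0), math.sqrt(3.0))))
    print("time mean, commensurable:  ", time_mean((1.0, 1.0, 1.0)))
```

The commensurable case illustrates why the uniform filling of the angle space is needed before the two averages can be identified.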

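We can now return from the angle coordinates to our original rect-
angular coordinates $Q_1 = x$, $Q_2 = y$, $Q_3 = z$. In view of the fact that
the new momenta are constants, the old coordinates may be considered
practically as functions of the new coordinates alone, and vice versa.
We can thus transform the volume integral (34) according to the
well-known theorem of Jacobi, and put
$$\bar f = \int fD\,dV, \tag{34a}$$
where $dV = dx\,dy\,dz$ and
$$D = \begin{vmatrix}
\dfrac{\partial w_1}{\partial x} & \dfrac{\partial w_1}{\partial y} & \dfrac{\partial w_1}{\partial z} \\
\dfrac{\partial w_2}{\partial x} & \dfrac{\partial w_2}{\partial y} & \dfrac{\partial w_2}{\partial z} \\
\dfrac{\partial w_3}{\partial x} & \dfrac{\partial w_3}{\partial y} & \dfrac{\partial w_3}{\partial z}
\end{vmatrix}.$$
By (32b) this functional determinant can be written in the form
$$D = \begin{vmatrix}
\dfrac{\partial^2 s}{\partial J_1\,\partial x} & \dfrac{\partial^2 s}{\partial J_1\,\partial y} & \dfrac{\partial^2 s}{\partial J_1\,\partial z} \\
\dfrac{\partial^2 s}{\partial J_2\,\partial x} & \dfrac{\partial^2 s}{\partial J_2\,\partial y} & \dfrac{\partial^2 s}{\partial J_2\,\partial z} \\
\dfrac{\partial^2 s}{\partial J_3\,\partial x} & \dfrac{\partial^2 s}{\partial J_3\,\partial y} & \dfrac{\partial^2 s}{\partial J_3\,\partial z}
\end{vmatrix}. \tag{34b}$$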
† This condition is satisfied for non-degenerate motion, that is, motion for which the
three fundamental frequencies $\omega_1, \omega_2, \omega_3$ are not commensurable with each other.


The volume integration in (34a) must be extended over the whole
region for which $W - U > 0$. We are thus brought to the conclusion
that the relative probability that the particle will be found in the
volume-element $dV$, as measured by the relative duration of its presence
in this volume-element, is equal to $D$ ($\int D\,dV = 1$). Comparing this
result with the wave-mechanical average
$$\bar f = \int f\psi\psi^*\,dV,$$
we see that it will agree approximately with (34a) if $\psi\psi^* = D$. Now
in the region $W - U > 0$ the function $s = S^0$ is real, so that the modulus
of the function $\psi = Ae^{i2\pi s/h} = e^{i2\pi S^0/h + S'}$ must reduce to $A = e^{S'}$. It
follows therefore that $A^2 = D$.


It should be remembered that an exact agreement between the classical
and the wave-mechanical mean value is out of the question, not only
because of the approximative character of the preceding expression for
$\psi$ (with $s$ determined from the Hamilton-Jacobi equation), but also
because in the wave-mechanical case the integration must be extended
over all space including the classically forbidden region. However, this
region, although infinite, contributes in general only a finite and usually
a small amount to the integral $\int f\psi\psi^*\,dV$ because of a very rapid
decrease of the function $|\psi|^2$.

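The relation $A^2 = D$ can of course be derived in a straightforward
way by integrating the equation
$$\operatorname{div}(A^2\nabla s) = 0$$
[cf. (17a)], or the equation
$$\nabla^2 S^0 + 2\nabla S^0\cdot\nabla S' = 0$$
to which (21b) is reduced in the case of conservative motion. This
integration has been carried out (in the case of the second equation)
by Van Vleck, who showed that $A^2$ must be proportional to the deter-
minant
$$\begin{vmatrix}
\dfrac{\partial^2 s}{\partial x\,\partial\alpha} & \dfrac{\partial^2 s}{\partial y\,\partial\alpha} & \dfrac{\partial^2 s}{\partial z\,\partial\alpha} \\
\dfrac{\partial^2 s}{\partial x\,\partial\beta} & \dfrac{\partial^2 s}{\partial y\,\partial\beta} & \dfrac{\partial^2 s}{\partial z\,\partial\beta} \\
\dfrac{\partial^2 s}{\partial x\,\partial\gamma} & \dfrac{\partial^2 s}{\partial y\,\partial\gamma} & \dfrac{\partial^2 s}{\partial z\,\partial\gamma}
\end{vmatrix},$$
where $\alpha, \beta, \gamma$ are any three integration constants occurring in the
expression of the function $s(x, y, z; \alpha, \beta, \gamma)$. This determinant is equal
to the product of $D$ with the determinant $\dfrac{\partial(\alpha, \beta, \gamma)}{\partial(J_1, J_2, J_3)}$, which is a con-
stant factor playing the role of a normalization constant.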
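In the special case of uni-dimensional motion the determinant (34 b)
reduces to $\partial^2 s/\partial x\,\partial J$, whereas by direct integration we obtained, in this
case, $A^2 = \dfrac{C^2}{v} = mC^2\Big/\dfrac{\partial s}{\partial x}$. Thus we must have
$$\frac{\partial^2 s}{\partial x\,\partial J} = mC^2\Big/\frac{\partial s}{\partial x},$$
that is,
$$\frac{\partial s}{\partial x}\,\frac{\partial}{\partial J}\left(\frac{\partial s}{\partial x}\right) = \frac{\partial}{\partial J}\left[\frac{1}{2}\left(\frac{\partial s}{\partial x}\right)^2\right] = mC^2,$$
or since $\dfrac{1}{2m}\left(\dfrac{\partial s}{\partial x}\right)^2 = W - U$, we get $\dfrac{\partial}{\partial J}(W - U) = C^2$. This condition
is actually fulfilled, for $\partial U/\partial J = 0$ and $\partial W/\partial J = \omega = 1/T$, where $T$ is
the period of motion [according to (32a) with $W = H'$]. Hence we
get $C^2 = 1/T$ in accordance with the simple theory developed in the
preceding section.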


II 
OPERATORS 


6. Operational Form of Schrödinger's Equation, and Operational Representation of Physical Quantities


The formal relation between classical mechanics and wave mechanics 
can be presented in another way which not only leads us to a deeper 
understanding of the theory but also to various important generaliza- 
tions. 

We can arrive at this relation by examining Schrödinger's equation
(12) written in the form
$$D\psi = 0,$$
where $D$ denotes the operator
$$D = \frac{1}{2m}\left[\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}\right)^2 + \left(\frac{h}{2\pi i}\frac{\partial}{\partial y}\right)^2 + \left(\frac{h}{2\pi i}\frac{\partial}{\partial z}\right)^2\right] + \frac{h}{2\pi i}\frac{\partial}{\partial t} + U.$$
This can be expressed in terms of the elementary differential operators
$$p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}, \quad p_y = \frac{h}{2\pi i}\frac{\partial}{\partial y}, \quad p_z = \frac{h}{2\pi i}\frac{\partial}{\partial z}, \quad p_t = \frac{h}{2\pi i}\frac{\partial}{\partial t} \tag{35}$$
by the formula
$$D = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + p_t + U. \tag{35a}$$
The equation $D\psi = 0$ thus reduces to the classical equation
$$T + U - W = 0$$
if we replace the operators $p_x, p_y, p_z$ by the components of the momen-
tum, and $-p_t$ by the total energy, i.e. if instead of (35) we put
$$p_x = g_x, \quad p_y = g_y, \quad p_z = g_z, \quad p_t = -W \tag{36}$$
and cancel the function $\psi$ (considering it as a factor). Therefore the
transition from classical mechanics to wave mechanics can formally be
carried out as follows. In the 'classical' equation
$$\frac{1}{2m}(g_x^2 + g_y^2 + g_z^2) + U - W = 0, \tag{36a}$$
which relates the components of the momentum and the total energy of
a particle, we must replace these quantities by the elementary operators
(35) and then multiply the Schrödinger operator $D$ thus obtained by
the wave function $\psi$ on the right, where 'right multiplication' simply
means applying the operator to the expression standing on its right.


The replacement of the energy $W$ by the operator $-p_t = -\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$ has
been made before, although in a somewhat different connexion, namely,
in the transition from the wave equation
$$\nabla^2\psi + \frac{8\pi^2 m}{h^2}(W - U)\psi = 0$$
for a conservative motion to the general equation
$$\nabla^2\psi - \frac{8\pi^2 m}{h^2}\left(\frac{h}{2\pi i}\frac{\partial}{\partial t} + U\right)\psi = 0,$$
which applies to a motion of any kind. In the former case, since
$\psi = \psi^0(x, y, z)\,e^{-i2\pi Wt/h}$, the operator $p_t$ is actually equivalent to the
energy in that it satisfies the equation $p_t\psi = -W\psi$, which we could write
symbolically (dropping the function operated upon) in the form $p_t = -W$.
A similar equivalence exists between the operators $p_x, p_y, p_z$ and the
components of the momentum $g_x, g_y, g_z$ with respect to the wave
function
$$\psi = \text{const.}\; e^{i2\pi(g_x x + g_y y + g_z z - Wt)/h},$$
representing the free motion of a particle with a velocity of specified
magnitude and direction. As we know, the latter can be specified only
in this particular case. In the general case the functions $p_x\psi$, $p_y\psi$, $p_z\psi$,
$-p_t\psi$ are not equal to the products of the function $\psi$ by constant
numbers.

It is natural to associate this result with the fact that, in the general
case, the components $g_x, g_y, g_z$ of the momentum, as well as the energy
$W$, cannot be defined as certain numbers since they do not have
definite values, and to assume further that the operators $p_x, p_y, p_z$,
$-p_t$ by which they are replaced in the transition from classical to wave
mechanics must replace them in all wave-mechanical questions.

This principle is corroborated by the following considerations. 

(1) If the wave function $\psi$ can be approximated to by the expression
$e^{i2\pi S/h}$ where $S$ is the classical 'action', i.e. the momentum-potential
determined by the Hamilton-Jacobi equation, then we have
$$p_x\psi = \frac{h}{2\pi i}\frac{\partial}{\partial x}e^{i2\pi S/h} = \frac{\partial S}{\partial x}\psi = g_x\psi,$$
etc., so that in this approximation the operators $p_x, p_y, p_z$ are actually
equivalent to the components of the momentum $g_x, g_y, g_z$. This result
still holds approximately if $\psi$ is represented in the form $Ae^{i2\pi s/h}$ where
$s$ is the classical momentum-potential, for the partial derivatives of the
amplitude $A$ with regard to $x, y, z$ (so far as the above approximation
can be applied) are very small compared with the partial derivatives
of $s/h$, i.e. the components of the wave number (the wave-length being
supposed to be very small).

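(2) If the function $\psi$ is 'quadratically integrable', i.e. if it can be
normalized in such a way that the integral $\int\psi\psi^*\,dV$ is equal to 1, then
the integrals
$$\int\psi^* p_x\psi\,dV, \quad \int\psi^* p_y\psi\,dV, \quad \int\psi^* p_z\psi\,dV$$
coincide with the average values of the components of the momentum
as defined by the integrals
$$m\int j_x\,dV, \quad m\int j_y\,dV, \quad m\int j_z\,dV,$$
where $\mathbf{j} = \psi\psi^*\mathbf{v}$ is the probability current density and $\mathbf{v}$ is the average
velocity introduced in the preceding chapter, §§ 2 and 3. We have in
fact, according to the definition of $j_x$,
$$m\int j_x\,dV = \frac{h}{4\pi i}\int\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right)dV.$$
Now by partial integration we get
$$\int\psi\frac{\partial\psi^*}{\partial x}\,dV = \int\frac{\partial}{\partial x}(\psi\psi^*)\,dV - \int\psi^*\frac{\partial\psi}{\partial x}\,dV = -\int\psi^*\frac{\partial\psi}{\partial x}\,dV,$$
since in order that $\int\psi\psi^*\,dV$ should have a finite value the function $\psi\psi^*$
must vanish at infinity rapidly enough to make the integral
$$\int\frac{\partial}{\partial x}(\psi\psi^*)\,dV = \iint\Big[\psi\psi^*\Big]_{x=-\infty}^{x=+\infty}dy\,dz$$
vanish too. Therefore
$$m\int j_x\,dV = \frac{h}{2\pi i}\int\psi^*\frac{\partial\psi}{\partial x}\,dV = \int\psi^* p_x\psi\,dV.$$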


The preceding results can be extended to the more complicated
operators, by which different classical quantities represented as certain
functions of the coordinates and momenta $F(x, y, z; g_x, g_y, g_z)$ must be
replaced, when $g_x, g_y, g_z$ are replaced by the operators $p_x, p_y, p_z$. The
simplest example of such a complicated operator is the operator
$T = (p_x^2 + p_y^2 + p_z^2)/2m$ representing the kinetic energy. If the func-
tion $\psi$ describes a motion with a given constant value of the total
energy, i.e. if it satisfies the Schrödinger equation $(T + U - W)\psi = 0$,
then we have $T\psi = (W - U)\psi$, where the 'operator' $(W - U)$ is a simple
factor. The preceding equation expresses the fact that the kinetic energy
(i.e. the magnitude of the classical velocity) is a definite function of the
coordinates. The sum of the operator $T$ and the potential energy $U$
represents the total energy of the particle and is usually called the
energy operator, or the Hamiltonian operator, or simply the 'Hamil-
tonian'. Denoting this operator by $H$, we can write the preceding
equation in the form $H\psi = W\psi$. It expresses the fact that the energy
of the particle in the motion described by the function $\psi$ has a definite
value, namely, $W$. The general equation referring to a non-conservative
motion can be written in the form
$$(H + p_t)\psi = 0. \tag{37}$$
It implies a certain relation between the two operators $H$ and $-p_t$,
both of which represent the energy $W$ (when it exists): the former in
a specific way, including the properties of the particle (mass) and the
character of the field of force in which it moves, and the latter in a
perfectly general way independent of these characteristics.

Independently of the form of the operator $F(x, y, z; p_x, p_y, p_z)$, it can
easily be shown that the result of applying it to the function $\psi$ ex-
pressed in the approximate form $e^{i2\pi S/h}$ (or $Ae^{i2\pi s/h}$) is equal approxi-
mately to the product $F(x, y, z; g_x, g_y, g_z)\psi$. The same is true in the
more general case of an operator containing the time $t$ and the time
derivative operator $p_t$. We have namely
$$F(x, y, z, t; p_x, p_y, p_z, p_t)\psi = F(x, y, z, t; g_x, g_y, g_z, -W)\psi,$$
if the energy $W$ is defined as $-\partial S/\partial t$, in accordance with the Hamilton-
Jacobi equation which gives $-\partial S/\partial t = (\nabla S)^2/2m + U = T + U$. The
function $F\psi$ resulting from the application of the operator $F$ to the
exact wave function $\psi$ can be represented as the product of the latter
with a certain function $F_\psi$ of the coordinates alone (and eventually of
the time). The function $F_\psi = (F\psi)/\psi$ can be defined as the value of the
quantity represented by the operator $F$ at the corresponding point (and
instant of time). This is precisely the way in which we have defined
above the value of the kinetic energy in the case of a conservative
motion. If, in particular, the ratio $(F\psi)/\psi$ is equal to a constant $C$,
then the quantity represented by $F$ is said to be a constant of the motion,
its value $C$ being independent of the position of the particle (and of
the time). This case can be illustrated by applying the energy operator
$H$ to a function $\psi$ which describes a conservative motion, or by applying
any one of the operators $p_x, p_y, p_z$ to the function $\psi$ which describes
a uniform rectilinear motion.

If the ratio $F_\psi = (F\psi)/\psi$ is not equal to a constant, then we can
define the average or probable value of the quantity represented by
the operator $F$ by means of the formula
$$\bar F = \int F_\psi\,\psi\psi^*\,dV$$
or
$$\bar F = \int\psi^* F\psi\,dV, \tag{38}$$
with the condition that
$$\int\psi\psi^*\,dV = 1. \tag{38a}$$
This definition of an average value is a generalization of that already
considered in the preceding chapter in connexion with quantities de-
pending on the coordinates alone (such as the potential energy). Its
physical significance has been tested above in the case of the funda-
mental operators $p_x, p_y, p_z$.
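A numerical illustration of formula (38) may be helpful here. The sketch below is not taken from the text: it assumes one dimension, units in which $h/2\pi = 1$, and a normalized Gaussian wave packet $\psi = \pi^{-1/4}e^{-x^2/2 + ikx}$ chosen purely for illustration. For this packet the ratio $(p_x\psi)/\psi = k + ix$ is not a constant, so $p_x$ has no definite value, yet the average (38) comes out equal to $k$.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which h / 2π = 1

def psi(x, k):
    """Normalized Gaussian packet exp(-x^2/2 + i k x) / π^(1/4) (illustrative choice)."""
    return math.pi ** -0.25 * cmath.exp(-0.5 * x * x + 1j * k * x)

def p_psi(x, k, dx=0.01):
    """(h / 2πi) dψ/dx, approximated by a central difference."""
    return (HBAR / 1j) * (psi(x + dx, k) - psi(x - dx, k)) / (2.0 * dx)

def local_value(x, k):
    """The ratio F_ψ = (Fψ)/ψ for F = p_x; it varies with x for this packet."""
    return p_psi(x, k) / psi(x, k)

def norm(k, half_width=10.0, dx=0.01):
    """Riemann sum for the normalization integral (38a)."""
    steps = int(round(2.0 * half_width / dx))
    return sum(abs(psi(-half_width + n * dx, k)) ** 2 * dx for n in range(steps + 1))

def mean_momentum(k, half_width=10.0, dx=0.01):
    """Riemann sum for the integral of ψ* p_x ψ dx, the second form of (38)."""
    steps = int(round(2.0 * half_width / dx))
    total = 0j
    for n in range(steps + 1):
        x = -half_width + n * dx
        total += psi(x, k).conjugate() * p_psi(x, k, dx) * dx
    return total

if __name__ == "__main__":
    print("norm:            ", norm(2.0))
    print("local value at x = 0.5:", local_value(0.5, 2.0))
    print("average momentum:", mean_momentum(2.0))
```

The finite-difference step and the integration range are arbitrary choices; they only need to be fine and wide enough for the Gaussian used.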

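As a further illustration of the operational representation of physical
quantities we shall consider the angular momentum of a particle, for
instance, the angular momentum of an electron moving about a fixed
nucleus (cf. Part I, § 14). In classical mechanics this quantity is defined
as a vector with the components
$$yg_z - zg_y, \quad zg_x - xg_z, \quad xg_y - yg_x.$$
We shall define it accordingly as a vector-operator $\mathbf{M}$ with the com-
ponents
$$M_x = yp_z - zp_y, \quad M_y = zp_x - xp_z, \quad M_z = xp_y - yp_x$$
or
$$M_x = \frac{h}{2\pi i}\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right), \quad M_y = \frac{h}{2\pi i}\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right), \quad M_z = \frac{h}{2\pi i}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right). \tag{39}$$
Transforming from rectangular coordinates to spherical coordinates by
means of the formulae
$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta,$$
we have
$$r\frac{\partial}{\partial r} = r\sin\theta\cos\phi\,\frac{\partial}{\partial x} + r\sin\theta\sin\phi\,\frac{\partial}{\partial y} + r\cos\theta\,\frac{\partial}{\partial z} = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z}$$
and likewise
$$\frac{\partial}{\partial\phi} = -r\sin\theta\sin\phi\,\frac{\partial}{\partial x} + r\sin\theta\cos\phi\,\frac{\partial}{\partial y} = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}.$$
We have therefore
$$M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi}. \tag{39a}$$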


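Further, from (39) we get
$$M^2 = M_x^2 + M_y^2 + M_z^2 = -\frac{h^2}{4\pi^2}\left[\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)^2 + \dots\right]$$
$$= -\frac{h^2}{4\pi^2}\left[y^2\frac{\partial^2}{\partial z^2} + z^2\frac{\partial^2}{\partial y^2} - 2yz\frac{\partial^2}{\partial y\,\partial z} - y\frac{\partial}{\partial y} - z\frac{\partial}{\partial z} + \dots\right],$$
where the terms denoted by ... are obtained from the given terms by
cyclic permutation of the coordinates $x, y, z$. Because of the identity
$$\left(x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z}\right)^2 = x^2\frac{\partial^2}{\partial x^2} + \dots + 2yz\frac{\partial^2}{\partial y\,\partial z} + \dots + x\frac{\partial}{\partial x} + \dots = \left(r\frac{\partial}{\partial r}\right)^2$$
we can write the previous expression in the form
$$M^2 = -\frac{h^2}{4\pi^2}\left[r^2\nabla^2 - \left(r\frac{\partial}{\partial r}\right)^2 - r\frac{\partial}{\partial r}\right] = -\frac{h^2}{4\pi^2}\left[r^2\nabla^2 - \frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right)\right].$$
Hence
$$\nabla^2 = -\frac{4\pi^2}{h^2}\frac{M^2}{r^2} + \frac{1}{r^2}\left[\left(r\frac{\partial}{\partial r}\right)^2 + r\frac{\partial}{\partial r}\right] = -\frac{4\pi^2}{h^2}\frac{M^2}{r^2} + \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r},$$
or putting $\nabla^2 = \dfrac{\partial^2}{\partial r^2} + \dfrac{2}{r}\dfrac{\partial}{\partial r} + \dfrac{1}{r^2}\Omega^2$, where
$$\Omega^2 = \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$
denotes the angular part of $\nabla^2$, we get
$$M^2 = -\frac{h^2}{4\pi^2}\Omega^2. \tag{39b}$$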


By applying this operator and the operator (39a) to the functions
$\psi_{nlm} = F_{nl}(r)\,Y_{lm}(\theta, \phi)$, which specify the stationary states of a hydrogen-
like atom, we get
$$M^2\psi_{nlm} = F_{nl}(r)\,M^2 Y_{lm}(\theta, \phi) = -\frac{h^2}{4\pi^2}F_{nl}(r)\,\Omega^2 Y_{lm}(\theta, \phi),$$
and by the equation $\Omega^2 Y_{lm} + l(l+1)Y_{lm} = 0$ we get
$$M^2\psi_{nlm} = \frac{h^2}{4\pi^2}\,l(l+1)\,\psi_{nlm}. \tag{40}$$
Since, further, the dependence of $Y_{lm}(\theta, \phi)$ upon $\phi$ is expressed by the
factor $e^{im\phi}$,
$$M_z\psi_{nlm} = \frac{hm}{2\pi}\psi_{nlm}. \tag{40a}$$


These relations show that the magnitude of the angular momentum as
well as its direction are constants of the motion, just as in the classical
theory of a particle moving in a central field of force. It should be
mentioned that the character of the central field affects only the radial
factor $F_{nl}(r)$ in the wave function $\psi_{nlm}$, the angular factor $Y_{lm}(\theta, \phi)$
being in all cases a spherical harmonic function. Therefore the above
relations hold for the motion of a particle not only in a Coulomb field
but in any central field of force. They show further that the quantum
numbers $l$ and $m$ which have been introduced in Part I, § 14, as
nodal numbers, characterizing the wave function $\psi_{nlm}$ from a purely
geometrical point of view, have also a dynamical meaning, one of them
($l$) determining the total magnitude of the angular momentum according
to the relation $M^2 = l(l+1)h^2/4\pi^2$, and the other ($m$) determining the
projection of the angular momentum upon the $z$-axis according to
$M_z = mh/2\pi$. For this reason the numbers $l$ and $m$ will be called re-
spectively the angular and the axial quantum numbers.† The constancy
of the direction of the angular momentum is only proved indirectly by
the relation (40a) because the direction of the $z$-axis can be chosen
arbitrarily, the functions $\psi_{nlm}$ being so defined that the $z$-axis is the
axis of the spherical harmonic functions $Y_{lm}(\theta, \phi) = P_{lm}(\theta)e^{im\phi}$. If we
apply the operators $M_x$ and $M_y$ to these functions the result will not
be similar to that obtained by applying the operator $M_z$ because the
functions $M_x\psi_{nlm}$ and $M_y\psi_{nlm}$ are not equal to multiples of $\psi_{nlm}$. Since
we know that $M_x$ and $M_y$ also represent constants of the motion, we
see that the condition $F\psi = \text{const.}\,\psi$ cannot be regarded as the general
criterion for the constancy of the quantity represented by the operator
$F$. It can easily be shown that the above failure of this equation to
express the general condition of dynamical constancy is connected with
degeneracy, i.e. with the fact that the functions $\psi_{nlm}$ are not determined
by the value of the energy $W_n$, which, in fact, depends only on the
'principal' quantum number ($n$). Any linear combination of the $n^2$
functions $\psi_{nlm}$ which differ from one another by the values assigned
to the numbers $l$ and $m$, will also represent a stationary state belonging
to the same value of the energy. This linear combination, i.e. the
coefficients $C_{lm}$ in the sum $\sum_{l,m} C_{lm}\psi_{nlm}$, can be so chosen that the
resulting function $\psi'_{nlm'}$ will represent the same thing with respect to the
$x$-axis as $\psi_{nlm}$ with respect to the $z$-axis. Applied to this function the
operator $M_x$ would be equivalent to multiplication by $m'h/2\pi$ accord-
ing to the equation $M_x\psi'_{nlm'} = (hm'/2\pi)\psi'_{nlm'}$, which could be considered
as a direct expression of the constancy of $M_x$. The function obtained
by applying $M_x$ to $\psi_{nlm}$ can easily be shown to reduce to a linear com-
bination $\sum_{m'=-l}^{+l} C_{m'}\psi_{nlm'}$ of the $2l+1$ functions $\psi_{nlm'}$ associated with the
$z$-axis.

† This seems preferable to the traditional denomination where $l$ is referred to as the
'azimuthal' quantum number and $m$ as the 'magnetic' quantum number.
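The relations (40) and (40a), and the different behaviour of $M_x$, can be verified on a simple case. The sketch below is an illustration under stated assumptions and not the author's procedure: it sets $h/2\pi = 1$, represents functions as polynomials in $x, y, z$ stored in a dictionary, and uses the polynomial $x + iy$, which is proportional to $r\,Y_{11}$ (the radial factor does not affect the action of $\mathbf{M}$). All helper names are hypothetical.

```python
HBAR = 1.0  # assumption: units in which h / 2π = 1

def combine(p, q, sq=1):
    """Return p + sq * q; polynomials are {(a, b, c): coefficient of x^a y^b z^c}."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + sq * c
    return {m: c for m, c in out.items() if abs(c) > 1e-12}

def times(p, axis):
    """Multiply p by the coordinate with index axis (0: x, 1: y, 2: z)."""
    return {m[:axis] + (m[axis] + 1,) + m[axis + 1:]: c for m, c in p.items()}

def momentum(p, axis):
    """Apply p_axis = (h / 2πi) ∂/∂(coordinate with index axis)."""
    return {m[:axis] + (m[axis] - 1,) + m[axis + 1:]: (HBAR / 1j) * m[axis] * c
            for m, c in p.items() if m[axis] > 0}

def M(p, i):
    """Component i of M as in (39): M_x = y p_z - z p_y, and cyclically."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return combine(times(momentum(p, k), j), times(momentum(p, j), k), -1)

def M_squared(p):
    """M^2 = M_x^2 + M_y^2 + M_z^2 applied to p."""
    total = {}
    for i in range(3):
        total = combine(total, M(M(p, i), i))
    return total

def same(p, q):
    """True if the two polynomials agree up to rounding."""
    return combine(p, q, -1) == {}

psi_z = {(1, 0, 0): 1, (0, 1, 0): 1j}   # x + i y : l = 1, m = 1 about the z-axis
psi_x = {(0, 1, 0): 1, (0, 0, 1): 1j}   # y + i z : the analogous function about the x-axis

if __name__ == "__main__":
    print("M_z psi_z =", M(psi_z, 2))        # equals psi_z, i.e. m = 1
    print("M^2 psi_z =", M_squared(psi_z))   # equals 2 psi_z, i.e. l(l+1) = 2
    print("M_x psi_z =", M(psi_z, 0))        # equals -z, not a multiple of psi_z
    print("M_x psi_x =", M(psi_x, 0))        # equals psi_x
```

Here $M_x(x + iy) = -z$ is not a multiple of $x + iy$ but is still a polynomial of the first degree, i.e. a combination of the functions with $l = 1$, while the combination $y + iz$ adapted to the $x$-axis is reproduced by $M_x$.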


7. Characteristic Functions and Values of Operators; Operational Equations; Constants of the Motion

In general the equation $F\psi = \text{const.}\,\psi$ can only be satisfied by functions
$\psi$ of a special type which depend upon the nature of the operator $F$
and are therefore called the characteristic functions of this operator
('Eigenfunktionen' of the German authors, often translated into
English as 'proper functions'). The corresponding values of the constant
factor are called the characteristic values of $F$. As an example we may
take Schrödinger's equation $H\psi = W\psi$. In this equation the wave
functions describing the stationary states of motion are the charac-
teristic functions of the energy operator $H$, and the energy-levels
$W$ are its characteristic values. In the case of $H$, as well as in the
case of any other operator, these values and the functions associated
with them can form both a discrete and a continuous set. The
characteristic functions are fully determined by an operator $F$ for a
one-dimensional problem, involving one coordinate only. In three-
dimensional problems there remains in general a certain ambiguity
in the choice of the functions $\psi$, as determined by a single equa-
tion of the type $F\psi = \text{const.}\,\psi$, an ambiguity which is known as 'de-
generacy' if $F$ is the energy operator $H$. Thus, for example, the operator
$M_z = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial\phi}$ specifies the corresponding characteristic functions only
with regard to their dependence upon $\phi$, defining them as $\psi = f(r, \theta)e^{im\phi}$
where $f(r, \theta)$ is an arbitrary function of $r$ and $\theta$. The operator $M^2$ like-
wise determines the dependence of the characteristic functions on the
angles $\theta, \phi$ only, the equation $M^2\psi = \text{const.}\,\psi$ being satisfied by
$\psi = f(r)Y_l(\theta, \phi)$ where $f(r)$ is an arbitrary function of $r$, and $Y_l(\theta, \phi)$ is an
arbitrary spherical harmonic of order $l$, which can be expressed as a sum
of $2l+1$ functions of the type $P_{lm}(\theta)e^{im\phi}$ with arbitrary coefficients.


Now we have also seen that Schrödinger's equation $H\psi = \text{const.}\,\psi$ in
the case of a hydrogen-like atom has for each characteristic value of
$H = W_n$ a solution of the form $\psi_n = f_n(r)\,Y(\theta, \phi)$, where $Y(\theta, \phi)$ is a sum
of $n^2$ spherical harmonic functions of the type $P_{lm}(\theta)e^{im\phi}$ with arbitrary
coefficients ($l = 0, 1, \dots, n-1$; $m = -l, \dots, +l$). We cannot therefore
completely specify the functions $\psi_{nlm}$ describing the stationary states
of a hydrogen atom by taking one of the three equations
$$H\psi = \text{const.}\,\psi, \quad M^2\psi = \text{const.}\,\psi, \quad M_z\psi = \text{const.}\,\psi, \tag{41}$$
but only by taking all three equations together. The functions $\psi_{nlm}$
then appear as the 'simultaneous characteristic functions' of the
operators $H$, $M^2$, and $M_z$, each of these functions belonging to a 'triplet'
of characteristic values $W_n$, $(M^2)_l = l(l+1)h^2/4\pi^2$, and $(M_z)_m = mh/2\pi$.
Another simple example of this relationship is provided by the
operators $p_x, p_y, p_z$. The characteristic functions of these operators are
obviously $f_1(y, z)e^{i2\pi g_x x/h}$, $f_2(z, x)e^{i2\pi g_y y/h}$, $f_3(x, y)e^{i2\pi g_z z/h}$; $f_1, f_2, f_3$ being
arbitrary functions of the corresponding arguments. Taken together
the three equations
$$p_x\psi = g_x\psi, \quad p_y\psi = g_y\psi, \quad p_z\psi = g_z\psi, \tag{41a}$$
where $g_x, g_y, g_z$ are constants, specify unambiguously the function
$$\psi = \text{const.}\; e^{i2\pi(g_x x + g_y y + g_z z)/h} \tag{41b}$$
which describes the uniform rectilinear motion of a particle with the
momentum components $g_x, g_y, g_z$, and which is a particular solution of
Schrödinger's equation $H\psi = W\psi$ with $H = (p_x^2 + p_y^2 + p_z^2)/2m$, i.e. with
$U = 0$, corresponding to free motion.

It should be mentioned that the expression (41 b) for $\psi$ is still incom-
plete (as well as the expression $\psi = f_{nl}(r)Y_{lm}(\theta, \phi)$ for the hydrogen-like
atom functions) inasmuch as it does not contain the time. The latter
can be introduced by the additional relation
$$-p_t\psi = W\psi,$$
giving $\psi \sim e^{-i2\pi Wt/h}$. The constant $W$ is, however, not independent, but
is connected with $g_x, g_y, g_z$ by the relation $W = (g_x^2 + g_y^2 + g_z^2)/2m$.

If $F$ is an ordinary function of the coordinates (or of the time too)
which does not contain the elementary differential operators $p_x, p_y, p_z$,
then the equation $F\psi = \text{const.}\,\psi$ has no solutions of the ordinary con-
tinuous type. The only possible solutions, except the trivial one $\psi = 0$,
are those for which the function $\psi$ is different from zero on the surface
$F = \text{const.}$ and vanishes outside this surface (which can be displaced
by varying arbitrarily the value of the constant).


Another interesting case is provided by operators which satisfy
the equation $F\psi = C\psi$ identically, i.e. irrespective of the choice of
the function $\psi$, and therefore do not determine this function at all.
$F = p_x x - x p_x$ is the simplest example of such an operator. Applying
it to some function $\psi$, we get
$$(p_x x - x p_x)\psi = \frac{h}{2\pi i}\left[\frac{\partial}{\partial x}(x\psi) - x\frac{\partial\psi}{\partial x}\right] = \frac{h}{2\pi i}\psi.$$
Thus we see that this operator has one single characteristic value
$C = h/2\pi i$ with which any function can be associated as a 'charac-
teristic function'. The preceding equation can be written symbolically
in the form
$$p_x x - x p_x = \frac{h}{2\pi i}, \tag{42}$$
which is obtained by omitting the arbitrary function $\psi$ to which the
left- and right-hand sides of this equation must be applied. We have,
of course, similar equations for the two other coordinates and the corre-
sponding components of the momentum-operator: $p_y y - y p_y = h/2\pi i$
and $p_z z - z p_z = h/2\pi i$. In addition we have the 'operational' equations
$p_x y - y p_x = 0$ or $p_x y = y p_x$, etc., which express the fact that the order
in which the operators $p_x$ and $y$ are applied to any function $f(x, y, z)$
is immaterial (since $x$ and $y$ are independent variables). The equations
$p_x p_y - p_y p_x = 0$ are quite similar to the equations $xy - yx = 0$ express-
ing the commutative law of ordinary multiplication. Two operators
$F$ and $G$ which, when applied successively in the order $F$, $G$ to any
function $\psi$ give the same result as when applied in the opposite order
$G$, $F$, are said to be commutable. This property is expressed symbolically
by the operational equation
$$FG = GF, \tag{42a}$$
which means that the ordinary equation
$$FG\psi = GF\psi$$
is satisfied identically, i.e. for any function $\psi$.

In general, the fact that the equation $A\psi = B\psi$ is satisfied identically
with respect to the function $\psi$, $A$ and $B$ being two outwardly different
operators, is expressed symbolically by the equation $A = B$. We shall
now give a few examples of such operational equations.

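Let us consider first of all the operator $F = p_x f - f p_x$ where $f(x, y, z)$
is an arbitrary (continuous) function of the coordinates. Applying it to
an arbitrary function $\psi$, we get
$$(p_x f - f p_x)\psi = \frac{h}{2\pi i}\left[\frac{\partial}{\partial x}(f\psi) - f\frac{\partial\psi}{\partial x}\right] = \frac{h}{2\pi i}\frac{\partial f}{\partial x}\psi,$$
so that
$$p_x f - f p_x = \frac{h}{2\pi i}\frac{\partial f}{\partial x}, \tag{43}$$
which means that the operator $p_x f - f p_x$ is equivalent to the multiplier
$\dfrac{h}{2\pi i}\dfrac{\partial f}{\partial x}$. The preceding equation is often written in the form
$$\frac{\partial f}{\partial x} = [p_x, f], \tag{43a}$$
where the bracket expression on the right side is defined by
$$[p_x, f] = \frac{2\pi i}{h}(p_x f - f p_x). \tag{43b}$$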
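The operational equations (42) and (43) can be checked mechanically on polynomials. The following sketch is an illustration only, under the assumptions that $h/2\pi = 1$ and that both the multiplier $f$ and the function $\psi$ are polynomials in the single variable $x$, stored as lists of coefficients; the helper names are hypothetical.

```python
HBAR = 1.0  # assumption: units in which h / 2π = 1, so that h / 2πi = 1 / 1j

def derivative(c):
    """Coefficients of d/dx of the polynomial sum(c[k] * x**k)."""
    return [k * c[k] for k in range(1, len(c))] or [0]

def product(a, b):
    """Coefficients of the product of two polynomials."""
    out = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            out[i + j] += u * v
    return out

def p_x(c):
    """The operator p_x = (h / 2πi) d/dx applied to a polynomial."""
    return [(HBAR / 1j) * v for v in derivative(c)]

def difference(a, b):
    """Coefficients of a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [u - v for u, v in zip(a, b)]

def commutator_on(f, psi):
    """(p_x f - f p_x) psi for a multiplier f(x)."""
    return difference(p_x(product(f, psi)), product(f, p_x(psi)))

def is_zero(c):
    return all(abs(v) < 1e-12 for v in c)

psi = [1, 2, 0, 3]      # 1 + 2x + 3x^3, an arbitrary test function
x = [0, 1]              # the multiplier f = x
f = [5, 0, -1, 4]       # the multiplier f = 5 - x^2 + 4x^3

# (42): (p_x x - x p_x) psi = (h / 2πi) psi
check_42 = difference(commutator_on(x, psi), [(HBAR / 1j) * v for v in psi])
# (43): (p_x f - f p_x) psi = (h / 2πi) (df/dx) psi
check_43 = difference(commutator_on(f, psi),
                      [(HBAR / 1j) * v for v in product(derivative(f), psi)])

if __name__ == "__main__":
    print("(42) holds:", is_zero(check_42))
    print("(43) holds:", is_zero(check_43))
```

Since the identities are linear in $\psi$ and in $f$, a check on monomials would suffice; arbitrary polynomials are used here only to make the test less special.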


If, in the above definition of $F$, we replace $f$ by $x$ and $p_x$ by $p_x^n$ [which
means differentiation of the $n$th order with regard to $x$, combined with
a multiplication by $(h/2\pi i)^n$], we get
$$(p_x^n x - x p_x^n)\psi = \left(\frac{h}{2\pi i}\right)^n\left[\frac{\partial^n}{\partial x^n}(x\psi) - x\frac{\partial^n\psi}{\partial x^n}\right] = \left(\frac{h}{2\pi i}\right)^n n\,\frac{\partial^{n-1}\psi}{\partial x^{n-1}} = \frac{h}{2\pi i}\,n\,p_x^{n-1}\psi,$$
so that
$$p_x^n x - x p_x^n = \frac{h}{2\pi i}\,n\,p_x^{n-1}, \tag{44}$$
which can be rewritten symbolically in the form
$$p_x^n x - x p_x^n = \frac{h}{2\pi i}\frac{\partial}{\partial p_x}p_x^n.$$
This formula can easily be generalized for any operator expressible as
the sum of terms $a_n p_x^n$ with coefficients $a_n$ which do not depend upon
the coordinate $x$. Denoting this operator by $f(p_x, p_y, p_z; y, z)$, we get
$$xf - fx = -\frac{h}{2\pi i}\frac{\partial f}{\partial p_x}, \tag{44a}$$
an equation very similar to (43) with $x$ playing the role of $-p_x$, and
$p_x$ the role of $x$. Putting
$$[x, f] = \frac{2\pi i}{h}(xf - fx) \tag{44b}$$
we can consider the equation
$$\frac{\partial f}{\partial p_x} = -[x, f] \tag{44c}$$
as the general definition of the operator $\partial/\partial p_x$. We shall write in general
$$[F, G] = \frac{2\pi i}{h}(FG - GF), \tag{45}$$
this 'bracket expression', introduced by Dirac as the quantum analogue
of the Poisson brackets, vanishing if the operators $F$ and $G$ commute
with one another.
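The relation between $x$ and an operator built from powers of $p_x$ can be tested in the same mechanical way. The sketch below is an illustration under assumptions not made in the text: $h/2\pi = 1$, one dimension, polynomial test functions stored as coefficient lists, and an operator $f(p_x) = \sum_n a_n p_x^n$ with constant coefficients. With these sign conventions it checks that $(xf - fx)\psi = -(h/2\pi i)(\partial f/\partial p_x)\psi$, which for $f = p_x^n$ is equation (44) with the two sides interchanged. The helper names are hypothetical.

```python
HBAR = 1.0  # assumption: units in which h / 2π = 1

def p_x(c):
    """p_x = (h / 2πi) d/dx acting on the polynomial sum(c[k] * x**k)."""
    return [(HBAR / 1j) * k * c[k] for k in range(1, len(c))] or [0]

def times_x(c):
    """Multiplication of a polynomial by x."""
    return [0] + list(c)

def combine(a, b, sb=1):
    """Return a + sb * b for coefficient lists of possibly different length."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [u + sb * v for u, v in zip(a, b)]

def apply_f(a, psi):
    """Apply f(p_x) = sum(a[n] * p_x**n) to psi."""
    result, term = [0], list(psi)
    for coeff in a:
        result = combine(result, term, coeff)
        term = p_x(term)
    return result

def d_dp(a):
    """Coefficients of the formal derivative of f with respect to p_x."""
    return [n * a[n] for n in range(1, len(a))] or [0]

def defect(a, psi):
    """(x f - f x) psi + (h / 2πi) (df/dp_x) psi, which should vanish."""
    lhs = combine(times_x(apply_f(a, psi)), apply_f(a, times_x(psi)), -1)
    return combine(lhs, apply_f(d_dp(a), psi), HBAR / 1j)

def is_zero(c):
    return all(abs(v) < 1e-12 for v in c)

psi = [1, -2, 0.5, 0, 3]   # an arbitrary polynomial test function

if __name__ == "__main__":
    print("f = p_x^3:           ", is_zero(defect([0, 0, 0, 1], psi)))
    print("f = 2 + p_x^2 + 5p^3:", is_zero(defect([2, 0, 1, 5], psi)))
```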



It should be noticed that an operational equation $A = B$ expresses
the identity of the physical quantities represented by the operators
$A$ and $B$; the existence of such equations indicates that the same
physical quantity can be represented in wave mechanics in a number
of apparently different ways.

Another interesting and important illustration of operational equa- 
tions is provided by the representation of the angular momentum of 
a particle. 

From the definition (39) it follows that
$$M_x^2 = (yp_z - zp_y)^2 = (yp_z)^2 - (yp_z)(zp_y) - (zp_y)(yp_z) + (zp_y)^2$$
$$= y^2p_z^2 + z^2p_y^2 - yp_y\,p_z z - zp_z\,p_y y,$$
since $p_y$ commutes with $z$ and $p_z$, and $p_z$ commutes with $y$ and $p_y$. Taking
into account the relations $p_z z = zp_z + h/2\pi i$ and $p_y y = yp_y + h/2\pi i$,
we get
$$M_x^2 = y^2p_z^2 + z^2p_y^2 - 2yz\,p_y p_z - \frac{h}{2\pi i}(yp_y + zp_z),$$
whence the formula (39b) can easily be obtained. We have in addition
$$M_x M_y = (yp_z - zp_y)(zp_x - xp_z) = yp_z\,zp_x - zp_y\,zp_x - yp_z\,xp_z + zp_y\,xp_z$$
$$= yp_x\,p_z z - z^2p_x p_y - xy\,p_z^2 + xz\,p_y p_z,$$
whence
$$M_x M_y - M_y M_x = yp_x\,p_z z + xz\,p_y p_z - xp_y\,p_z z - yz\,p_x p_z$$
$$= (yp_x - xp_y)(p_z z - zp_z) = \frac{h}{2\pi i}(yp_x - xp_y) = -\frac{h}{2\pi i}M_z.$$
Thus, according to (45),
$$[M_x, M_y] = -M_z. \tag{45a}$$
In a similar way we can derive the relations $[M_y, M_z] = -M_x$ and
$[M_z, M_x] = -M_y$, which can also be obtained from (45a) by a cyclic
permutation of the indices $x, y, z$. These three relations can be replaced
by the symbolic vector equation
$$\mathbf{M}\times\mathbf{M} = -\frac{h}{2\pi i}\mathbf{M}, \tag{45b}$$
where $\mathbf{A}\times\mathbf{B}$ is defined in the usual way as the vector product of $\mathbf{A}$
and $\mathbf{B}$.
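The three relations contained in (45b) can be confirmed by letting the operators act on a test polynomial. The sketch below is a hedged illustration and not the author's derivation: it assumes $h/2\pi = 1$ and stores polynomials in $x, y, z$ as dictionaries; the helper names and the test polynomial are arbitrary choices.

```python
HBAR = 1.0  # assumption: units in which h / 2π = 1

def combine(p, q, sq=1):
    """Return p + sq * q; polynomials are {(a, b, c): coefficient of x^a y^b z^c}."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + sq * c
    return {m: c for m, c in out.items() if abs(c) > 1e-12}

def times(p, axis):
    """Multiply p by the coordinate with index axis (0: x, 1: y, 2: z)."""
    return {m[:axis] + (m[axis] + 1,) + m[axis + 1:]: c for m, c in p.items()}

def momentum(p, axis):
    """Apply p_axis = (h / 2πi) ∂/∂(coordinate with index axis)."""
    return {m[:axis] + (m[axis] - 1,) + m[axis + 1:]: (HBAR / 1j) * m[axis] * c
            for m, c in p.items() if m[axis] > 0}

def M(p, i):
    """Component i of M as in (39): M_x = y p_z - z p_y, and cyclically."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return combine(times(momentum(p, k), j), times(momentum(p, j), k), -1)

def relation_holds(p, i):
    """Check (M_i M_j - M_j M_i) p = -(h / 2πi) M_k p for cyclic i, j, k."""
    j, k = (i + 1) % 3, (i + 2) % 3
    lhs = combine(M(M(p, j), i), M(M(p, i), j), -1)   # M_i M_j p - M_j M_i p
    rhs = combine({}, M(p, k), -(HBAR / 1j))
    return combine(lhs, rhs, -1) == {}

# Arbitrary test polynomial x^2 y + 3 y z^2 - 2 x z + 0.5 z^3
test_poly = {(2, 1, 0): 1, (0, 1, 2): 3, (1, 0, 1): -2, (0, 0, 3): 0.5}

if __name__ == "__main__":
    for i, name in enumerate(["[M_x, M_y] = -M_z", "[M_y, M_z] = -M_x", "[M_z, M_x] = -M_y"]):
        print(name, relation_holds(test_poly, i))
```

The check on one polynomial is of course no proof; it merely guards against sign errors in the operator algebra above.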

Interesting results are obtained by calculating the bracket expres-
sions for the components of the vector $\mathbf{M}$ on the one hand, and the
components of the vector $\mathbf{r}(x, y, z)$ or $\mathbf{p}(p_x, p_y, p_z)$ on the other. We
shall not go into these calculations (which can easily be carried out by
the reader) but shall merely notice the following results:
$$[p^2, \mathbf{M}] = 0, \quad [p^2, M^2] = 0, \tag{46}$$
where $p^2 = p_x^2 + p_y^2 + p_z^2$, the first of these equations being equivalent
to the three equations $[p^2, M_x] = 0$, $[p^2, M_y] = 0$, $[p^2, M_z] = 0$. These
equations express the fact that the angular momentum of a particle
commutes with its kinetic energy $T = p^2/2m$ (more exactly we should
speak of the operators representing the angular momentum and the
kinetic energy). If the potential energy $U$ is a function of the distance
$r = \sqrt{x^2 + y^2 + z^2}$ alone (which corresponds to a central field of force),
then we also have
$$[U, \mathbf{M}] = 0, \quad [U, M^2] = 0, \tag{46a}$$
and consequently
$$[H, \mathbf{M}] = 0, \quad [H, M^2] = 0, \tag{46b}$$
where $H = p^2/2m + U$ is the Hamiltonian operator representing the
total energy of the particle.

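The relations (46b) can be obtained very simply by using polar
coordinates to represent $H$ and $\mathbf{M}$. Then
$$H = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2\left[\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\Omega^2\right] + U(r),$$
$$M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi}, \qquad M^2 = -\frac{h^2}{4\pi^2}\Omega^2,$$
and so
$$[H, M_z] = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^3\frac{1}{r^2}\left[\Omega^2, \frac{\partial}{\partial\phi}\right], \qquad [H, M^2] = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^4\frac{1}{r^2}\left[\Omega^2, \Omega^2\right],$$
both bracket expressions $[\Omega^2, \partial/\partial\phi]$ and $[\Omega^2, \Omega^2]$ obviously vanishing.†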
The equations (46b) must be naturally related to the fact that $\mathbf{M}$
and $M^2$ represent quantities which are constants of the motion (in the
case of a radially symmetrical field of force). An equation of the type
$$[H, F] = 0, \tag{47}$$
i.e. the commutability of an operator $F$ with the energy operator $H$,
can actually be considered as the most general expression of the fact that
$F$ represents a constant of the motion determined by the operator $H$,
i.e. by Schrödinger's equation $H\psi = W\psi$.

In fact, applying the operator $F$ to both sides of this equation, we
have $FH\psi = WF\psi$ or, if $HF = FH$, we obtain $H(F\psi) = W(F\psi)$. This
shows that the function $F\psi$ satisfies the same equation as the function


† In order to obtain (46 a) without the use of polar coordinates we need only notice
that $[U, M_x] = [U, yp_z - zp_y] = y[U, p_z] - z[U, p_y] = z\,\partial U/\partial y - y\,\partial U/\partial z = 0$ according to (43a).


60 OPERATORS §7 
y with the same characteristic value of the energy operator H. If there 
is no degeneracy, i.e. if there is but one function % associated with the 
characteristic value W, then Fy can differ from by a constant factor 
only (which is immaterial so far as the equation Hy = Wy is con- 
cerned). Thus in this case we get Fy = const. ys, which is the original 
condition for the constancy of the quantity represented by F in the 
motion described by y%. In the general case, i.e. when there is de- 
generacy, the function Fy must obviously be equal to a linear com- 
bination of all the functions y¥,, y,,..., ~, associated with the same 
characteristic value of H, i.e. satisfying the equation Hy, = Wy, 
(k == 1,2,...,r), with the same value of the energy. Applying F to one 
of these functions we thus get, if FH = HF, 


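$$F\psi_k = \sum_{l=1}^{r} c_{kl}\psi_l, \tag{47 a}$$

where $c_{kl}$ are constant numbers, the matrix

$$\begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1r} \\ c_{21} & c_{22} & \cdots & c_{2r} \\ \vdots & \vdots & & \vdots \\ c_{r1} & c_{r2} & \cdots & c_{rr} \end{pmatrix}$$

replacing the single constant $C$ of the non-degenerate case.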

The fact that the equations (47 a) actually express the constancy of
$F$ can be proved by reducing them to a system of the standard form

$$F\psi'_n = c'_n\psi'_n, \tag{47 b}$$

where $\psi'_n$ ($n = 1, 2, \ldots, r$) are a set of $r$ new characteristic functions of
$H$ belonging to the same energy-level $W$ as the original functions
$\psi_1, \ldots, \psi_r$ and therefore equal to certain linear combinations of the latter.
In order to determine them, we shall first consider the inverse transformation,
i.e. we shall express the original functions as linear combinations
of the new ones by means of the formulae

$$\psi_k = \sum_{n=1}^{r} a_{kn}\psi'_n. \tag{48}$$

If these expressions are substituted in equations (47 a), then, in conjunction
with (47 b), we get


$$\sum_{n} a_{kn}c'_n\psi'_n = \sum_{l}\sum_{n} c_{kl}a_{ln}\psi'_n.$$

Equating the coefficients of the same $\psi'_n$ and dropping the index $n$,
we get

$$\sum_{l=1}^{r} c_{kl}a_l = c'a_k \qquad (k = 1, 2, \ldots, r). \tag{48 a}$$

This is a system of $r$ linear homogeneous equations for the determination
both of the transformation coefficients $a$ and of the characteristic
values $c'$. The compatibility condition for equations (48 a)

$$\begin{vmatrix} c_{11}-c' & c_{12} & \cdots & c_{1r} \\ c_{21} & c_{22}-c' & \cdots & c_{2r} \\ \vdots & \vdots & & \vdots \\ c_{r1} & c_{r2} & \cdots & c_{rr}-c' \end{vmatrix} = 0 \tag{48 b}$$

gives $r$ (in general different) values for the unknown $c'$, and to each
of these values $c'_n$ there belongs a definite set of coefficients $a_k$, namely,
$a_{1n}, a_{2n}, \ldots, a_{rn}$. By solving equations (48) with respect to the $\psi'_n$ we
can obtain the explicit expressions for the new functions in terms of
the original ones.
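
The reduction of (47 a) to the standard form (47 b) can be illustrated by a small numerical sketch. The 3×3 matrices below are hypothetical and chosen only so that $F$ commutes with $H$ and $H$ has a doubly degenerate level; they are not taken from the text. Since $F$ is real and symmetric here, the order of the indices of $c_{kl}$ is immaterial.

```python
import numpy as np

# Hypothetical example: H has a doubly degenerate level W = 1,
# and F is a symmetric matrix that commutes with H.
H = np.diag([1.0, 1.0, 2.0])
F = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 5.0]])

commutator = H @ F - F @ H          # vanishes when F is a constant of the motion

# Matrix c_kl of (47 a): F restricted to the degenerate level W = 1,
# which is spanned here by the first two basis vectors.
c = F[:2, :2]

# Solving (48 a): eigenvalues are the characteristic values c',
# eigenvectors hold the transformation coefficients.
c_prime, vecs = np.linalg.eigh(c)

# Embed the new functions in the full space and test (47 b)
# together with H psi' = W psi'.
new_functions = np.zeros((3, 2))
new_functions[:2, :] = vecs
residual_F = F @ new_functions - new_functions * c_prime
residual_H = H @ new_functions - 1.0 * new_functions
```

In this example the roots of (48 b) come out different, so the degeneracy with respect to $H$ is completely removed by $F$.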

Summing up the preceding results, we can say that the condition
$[H, F] = 0$ expresses the constancy of $F$ with respect to all such types
of motion as are described by functions $\psi$ satisfying simultaneously
the equations $H\psi = \text{const.}\,\psi$ and $F\psi = \text{const.}\,\psi$. The functions $\psi$ are
thus simultaneously the characteristic functions of both $H$ and $F$.

So far we have regarded the energy as the queen of all the operators,
but the above considerations seem to banish the energy from this
supreme position and to reduce the Schrödinger equation $H\psi = \text{const.}\,\psi$
to the same humble role as that of any other equation $F\psi = \text{const.}\,\psi$
for the characteristic functions and values of any other operator $F$.
Provided the operator $F$ has a dynamical meaning, its characteristic
functions will describe the motion just as well as the Schrödinger wave
functions although perhaps less completely and from a different point
of view. The product $\psi\psi^*\,dV$ will represent the probability of finding the
particle in the volume-element $dV$ even if $\psi$ is a characteristic function
of some operator $F$ different from the energy without being simultaneously
a characteristic function of the latter. The above-mentioned
difference in the point of view is obviously as follows: if $\psi$ is the
characteristic function of Schrödinger's wave equation, then $\psi\psi^*\,dV$ measures
the probability of finding the particle in the volume-element $dV$ with
a specified energy $W$ (the characteristic value of $H$ associated with $\psi$);
if $\psi$ is the characteristic function of some other operator $F$, then $\psi\psi^*\,dV$
measures the probability of finding the particle in the volume-element
$dV$ with a specified value of the quantity represented by $F$.

The fact that the probability determined by some 'wave function'
$\psi$ has a conditional character only, dependent upon the assumption of
a certain specified value for the quantity or quantities by which (or
rather by whose operators) the function $\psi$ is characterized, is of fundamental
importance for a deeper understanding and further development
of wave-mechanical theory. We shall not stress this further here, but
shall limit ourselves to the following remarks.

(1) In the case of a one-dimensional motion the Schrödinger wave
functions are completely determined by one operator only, namely, the
energy operator $H$. This means that the energy is the only independent
constant of the motion, i.e. that any other operator $F$ commuting with
$H$ represents simply a function of $H$. A function of this kind can be
defined by the fact that its characteristic values are a definite function
of the characteristic values of $H$. If, for instance, $H\psi = W\psi$, then

$$H^2\psi = H(H\psi) = HW\psi = WH\psi = W^2\psi, \qquad H^3\psi = W^3\psi,$$

and in general
$$F(H)\psi = F(W)\psi, \tag{49}$$
a result which can be proved directly if $F$ is represented by a power
series in $H$ with constant coefficients and which can be used as a definition
of $F(H)$ in the general case. The wave functions describing the
motion of a particle in three dimensions are completely determined not
by the energy operator alone, but by three independent mutually commuting
operators which represent three constants of the motion (one
of them being the energy, or else all three indirectly involving the energy
and commuting with it), such that their common characteristic
functions are at the same time solutions of the Schrödinger equation
$H\psi = W\psi$.

(2) If the function $\psi$ does not satisfy this equation, then it does not
describe the motion, and the operator or operators by which it is defined
(according to the equations $F\psi = \text{const.}\,\psi$) can be said to have specified
values, but not constant values, i.e. values which are not permanent in time.
Thus time appears as the correlate of energy, a fact which is obvious
in view of the possibility of representing the energy not only by the
Hamiltonian operator $H$, but also by the time derivative operator
$-p_t = -\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$, the general form of the Schrödinger equation $(H + p_t)\psi = 0$
merely expressing the equivalence of the two representations with
respect to a certain set of functions.
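
Relation (49) of remark (1) can be checked on a finite matrix. The sketch below is only an illustration: the 2×2 matrix standing in for $H$ and the particular power series $F(H) = 1 + 2H + H^3$ are assumptions made for the example.

```python
import numpy as np

# Hypothetical symmetric matrix standing in for the energy operator H.
H = np.array([[2.0, 1.0],
              [1.0, 3.0]])
W, psi = np.linalg.eigh(H)      # columns of psi: characteristic functions of H

def F_of_operator(A):
    # F(H) = 1 + 2 H + H^3, a power series with constant coefficients
    return np.eye(len(A)) + 2.0 * A + A @ A @ A

def F_of_number(w):
    # the same function applied to an ordinary number
    return 1.0 + 2.0 * w + w**3

lhs = F_of_operator(H) @ psi    # F(H) applied to each characteristic function
rhs = psi * F_of_number(W)      # each function multiplied by F(W)

# F(H) commutes with H, as any function of H must.
commutator = F_of_operator(H) @ H - H @ F_of_operator(H)
```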


8. Probable Values of Physical Quantities and their Change with 
the Time 

In classical mechanics time enjoys a supreme role entirely different
from all the other variables, being actually the only independent
variable. The main problem of mechanics is to determine how all the
other variables (in particular the coordinates) change with the time.
In wave mechanics the time seems, at first sight, to be reduced to
a humbler role, since the spatial coordinates no longer depend on the
time but are treated, so far as the wave-mechanical 'equation of
motion' is concerned, as independent variables, that is, they appear
on the same footing as the time itself.

This equivalence between the spatial coordinates and the time is
restricted, however, as we know, to the wave equation $(H + p_t)\psi = 0$
and does not extend to the boundary conditions under which it has to
be solved nor to the interpretation of its solutions. Thus a function
$\psi(x, y, z, t)$ which satisfies the preceding equation is interpreted as the
measure of the probability of finding the particle under consideration
in a volume-element $dV = dx\,dy\,dz$ at a definite instant of time, the
probability in question being defined as equal or proportional to $\psi\psi^*\,dV$.
If time played the same role as the coordinates, we should not be able
to refer the probability to a definite instant of time but should instead
refer it to an interval of time $dt$, and define it as proportional to $\psi\psi^*\,dV\,dt$.
There is, however, actually no reason why we should not be able to
refer the probability of location to a given instant of time, for the
particle must be somewhere at any moment. The exceptional role of
the time becomes particularly clear if we restrict ourselves to solutions
of the Schrödinger equation which vanish at infinite distance (they
cannot vanish for $t = \pm\infty$ except in separate places!) in such a way
as to ensure the convergence of the integral $\int \psi\psi^*\,dV$ extended over
all space. Taking the time derivative of this integral and replacing
$\partial(\psi\psi^*)/\partial t$ by $-\operatorname{div}\mathbf{j}$, where $\mathbf{j} = \dfrac{h}{4\pi i m}(\psi^*\nabla\psi - \psi\nabla\psi^*)$ is the probability
current density, then, if the integration is first extended over a finite
volume limited by a closed surface, we get


$$\frac{d}{dt}\int \psi\psi^*\,dV = -\oint j_n\,dS, \tag{50}$$

where $j_n$ is the normal component of $\mathbf{j}$. When the surface $S$ is removed
to infinity the latter integral tends to zero (so long as $\psi$ is supposed to
be quadratically integrable), so that in the limit we get

$$\int \psi\psi^*\,dV = \text{const.},$$

which enables one to normalize $\psi$ to 1 by the condition
$$\int \psi\psi^*\,dV = 1. \tag{50 a}$$

It should be remarked that this result holds for the motion of the
particle not only in a constant field of force (this case has been considered
in § 17, Part I), but also in a variable field of force.
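
The permanence of the normalization integral in a variable field of force can be illustrated numerically. The following is a minimal one-dimensional sketch, not a method taken from the text: the grid, the units $h/2\pi = m = 1$, the time-dependent potential, and the Cayley (Crank-Nicolson) time step are all assumptions made for the example. The wave function is taken to vanish outside the grid, which plays the role of the surface integral in (50) tending to zero.

```python
import numpy as np

n, length = 200, 20.0
x = np.linspace(-length / 2, length / 2, n)
dx = x[1] - x[0]

# Kinetic energy -(1/2) d^2/dx^2 by second differences (psi = 0 outside the grid).
lap = (np.diag(np.ones(n - 1), 1) - 2.0 * np.eye(n)
       + np.diag(np.ones(n - 1), -1)) / dx**2
T = -0.5 * lap

def hamiltonian(t):
    # Hypothetical variable field: oscillator plus a force oscillating in time.
    U = 0.5 * x**2 + 0.3 * x * np.sin(2.0 * t)
    return T + np.diag(U)

psi = np.exp(-(x - 1.0)**2).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)      # condition (50 a)

dt, steps = 0.01, 100
identity = np.eye(n)
norms = []
for k in range(steps):
    Hk = hamiltonian((k + 0.5) * dt)
    # Cayley step: (1 + i dt H/2) psi_new = (1 - i dt H/2) psi_old
    psi = np.linalg.solve(identity + 0.5j * dt * Hk,
                          (identity - 0.5j * dt * Hk) @ psi)
    norms.append(np.sum(np.abs(psi)**2) * dx)
```

The list `norms` stays equal to 1 up to rounding errors, because each step is a unitary transformation whenever the matrix representing $H$ is Hermitian.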

Now if $\int \psi\psi^*\,dV$ is constant, it is futile to consider the integral
$\iint \psi\psi^*\,dV\,dt$ with a view to normalizing the function $\psi$ in such a way
that the time would appear on the same footing as the coordinates.
The Hamiltonian operator $H$, which, as we have seen, is intimately
connected with the time, must therefore play an exceptional role in
determining the permanence or non-permanence in time of different
quantities connected with the motion.

As has been shown before, this permanence is determined by the condition
$HF - FH = 0$, where $F$ is the operator representing the quantity
in question. We are now going to generalize this result for quantities
which are not constants of the motion, i.e. quantities for which the condition
$HF - FH = 0$ is not fulfilled.

In classical mechanics such quantities can be determined as functions
of the time. In wave mechanics such a determination is only possible
for their probable values, as defined by

$$\bar{F} = \int \psi^* F\psi\,dV,$$

under the condition (50 a) (which is fulfilled for a motion restricted to
a finite region or represented by a wave packet).
Differentiating $\bar{F}$ with regard to the time, and taking into account
the equations

$$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0, \qquad \left(H - \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi^* = 0,$$

we get

$$\frac{d\bar{F}}{dt} = \frac{2\pi i}{h}\int \left[(H\psi^*)(F\psi) - \psi^* F(H\psi)\right] dV.$$

Now it can easily be shown that
$$\int (H\psi^*)(F\psi)\,dV = \int \psi^* H(F\psi)\,dV.$$
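In fact, putting $F\psi = f_1$, $\psi^* = f_2$, and writing the operator $H$ in the
form

$$H = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + U,$$

we find

$$\int (f_1 Hf_2 - f_2 Hf_1)\,dV = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2 \int \left[\frac{\partial}{\partial x}\left(f_1\frac{\partial f_2}{\partial x} - f_2\frac{\partial f_1}{\partial x}\right) + \frac{\partial}{\partial y}\left(f_1\frac{\partial f_2}{\partial y} - f_2\frac{\partial f_1}{\partial y}\right) + \frac{\partial}{\partial z}\left(f_1\frac{\partial f_2}{\partial z} - f_2\frac{\partial f_1}{\partial z}\right)\right] dV$$

$$= \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2 \int \operatorname{div}\mathbf{f}_{12}\,dV,$$

where $\mathbf{f}_{12} = f_1\nabla f_2 - f_2\nabla f_1$.
If the integral $\int f_1 f_2\,dV$
is convergent, then the integral $\int \operatorname{div}\mathbf{f}_{12}\,dV = \oint f_{12n}\,dS$ must vanish
when the integration is extended over all space (the surface $S$ receding
to infinity), so that we get

$$\int f_1 Hf_2\,dV = \int f_2 Hf_1\,dV. \tag{51}$$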

It should be mentioned that all operators having the property expressed
by this equation are called 'self-adjoint'. Strictly speaking, the
self-adjointness of an operator $H$ is expressed by the fact that the
difference $f_1 Hf_2 - f_2 Hf_1$ is equal to the divergence of some vector; this
condition leads to (51) when combined with the condition

$$\int f_1 f_2\,dV = \text{finite}. \tag{51 a}$$

The latter condition is certainly fulfilled for $f_1 = F\psi$ and $f_2 = \psi^*$ so
long as (50 a) is fulfilled.
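
Property (51) can be verified on a grid for two functions that fall off rapidly. This is a rough one-dimensional sketch under assumed units $h/2\pi = m = 1$, with a hypothetical oscillator potential and hypothetical trial functions $f_1$, $f_2$; the functions are taken to vanish outside the grid, in accordance with (51 a).

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 1601)
dx = x[1] - x[0]
U = 0.5 * x**2                      # hypothetical potential energy

def apply_H(f):
    # H = -(1/2) d^2/dx^2 + U; f is assumed to vanish outside the grid
    padded = np.concatenate(([0.0], f, [0.0]))
    second = (padded[2:] - 2.0 * padded[1:-1] + padded[:-2]) / dx**2
    return -0.5 * second + U * f

f1 = np.exp(-(x - 0.5)**2)          # hypothetical quadratically integrable functions
f2 = x * np.exp(-0.5 * x**2)

left = np.sum(f1 * apply_H(f2)) * dx     # integral of f1 H f2
right = np.sum(f2 * apply_H(f1)) * dx    # integral of f2 H f1
```

The two sums agree up to rounding errors, the discrete counterpart of the surface integral of $\mathbf{f}_{12}$ vanishing at infinity.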
We thus can rewrite the above expression for $d\bar{F}/dt$ in the form

$$\frac{d\bar{F}}{dt} = \frac{2\pi i}{h}\int \left[\psi^* H(F\psi) - \psi^* F(H\psi)\right] dV,$$

or

$$\frac{d\bar{F}}{dt} = \frac{2\pi i}{h}\int \psi^*(HF - FH)\psi\,dV. \tag{52}$$

It follows from this formula that $d\bar{F}/dt = 0$, which means that $F$ is
a constant of the motion, if $HF = FH$. This agrees with the result
found before. According to the general definition of the probable value
of a quantity represented by some operator $F$, we can define the
right-hand side of (52) as the average value of the operator

$$\frac{2\pi i}{h}(HF - FH) = [H, F].$$

Therefore
$$\frac{d\bar{F}}{dt} = \overline{[H, F]},$$
or
$$\frac{dF}{dt} = [H, F], \tag{52 a}$$
if $dF/dt$ is regarded as an operator defined by equation (52 a) and satisfying
the condition
$$\overline{\frac{dF}{dt}} = \frac{d\bar{F}}{dt}.$$
In the derivation of (52 a) we have tacitly assumed that $F$ did not
contain the time explicitly. If it does contain the time, then equation
(52 a) must be replaced by

$$\frac{dF}{dt} = \frac{\partial F}{\partial t} + [H, F]. \tag{52 b}$$

For example, let us put $F = x$. The time derivative of $x$ as a quantity
is equal to zero, since $x$ is independent of $t$. Regarding $x$, or rather
$dx/dt$, as an operator, however, we have

$$\frac{dx}{dt} = [H, x] = -[x, H],$$

or according to (44 c)
$$\frac{dx}{dt} = \frac{\partial H}{\partial p_x}, \tag{53}$$
which, with $H = \dfrac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + U(x, y, z)$,
gives
$$\frac{dx}{dt} = \frac{1}{m}p_x. \tag{53 a}$$

This equation coincides superficially with the classical relation between
velocity and momentum, considered as definite quantities. In wave
mechanics, however, they are indefinite quantities represented by the
operators $d\mathbf{r}/dt$ and $\mathbf{p} = m\,d\mathbf{r}/dt$. Putting $F = p_x$, we have

$$\frac{dp_x}{dt} = [H, p_x] = [U, p_x] = -[p_x, U]$$

or, according to (43 a),

$$\frac{dp_x}{dt} = -\frac{\partial H}{\partial x} = -\frac{\partial U}{\partial x}. \tag{53 b}$$

Equations (53) and (53 b), together with the corresponding equations
for the $y$ and $z$ components, are formally identical with the classical
equations of motion in the 'canonical' form (see preceding chapter, § 5).
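
The relation between the rate of change of a probable value and the bracket expression, on which (53) and (53 b) rest, can be tested numerically for finite matrices. The sketch below is an illustration only: the random Hermitian matrices standing in for $H$ and $F$, the initial state, and the units $h/2\pi = 1$ are assumptions. In these units the bracket $[H, F]$ used in the text becomes $i(HF - FH)$.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

n = 6
H = random_hermitian(n)             # hypothetical energy matrix
F = random_hermitian(n)             # hypothetical quantity, not a constant of the motion
psi0 = rng.normal(size=n) + 1j * rng.normal(size=n)
psi0 /= np.linalg.norm(psi0)

W, V = np.linalg.eigh(H)

def psi_at(t):
    # solution of (H + p_t) psi = 0: each component acquires the factor exp(-i W t)
    return V @ (np.exp(-1j * W * t) * (V.conj().T @ psi0))

def mean(op, t):
    p = psi_at(t)
    return np.real(np.vdot(p, op @ p))

bracket = 1j * (H @ F - F @ H)      # [H, F] in the sense of the text, with h/2pi = 1

t, eps = 0.7, 1e-5
lhs = (mean(F, t + eps) - mean(F, t - eps)) / (2 * eps)   # d(F-bar)/dt by differences
rhs = mean(bracket, t)                                    # average value of [H, F]
```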
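If the classical quantity represented by the operator $F$ is defined as a
function of the time and of the (classical) variables $x, p_x$; $y, p_y$; $z, p_z$,
we have

$$\frac{dF}{dt} = \frac{\partial F}{\partial t} + \sum\left(\frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial p_x}\frac{dp_x}{dt}\right) = \frac{\partial F}{\partial t} + \sum\left(\frac{\partial F}{\partial x}\frac{\partial H}{\partial p_x} - \frac{\partial F}{\partial p_x}\frac{\partial H}{\partial x}\right)$$

according to (53) and (53 b). Comparing this with (52 b) we see that
the classical analogue of the quantum bracket expression $[H, F]$ is the
sum

$$\sum\left(\frac{\partial H}{\partial p_x}\frac{\partial F}{\partial x} - \frac{\partial H}{\partial x}\frac{\partial F}{\partial p_x}\right),$$

which is the classical Poisson bracket expression.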


Equation (52 a) looks very similar to equation (43) and the equations
corresponding to the other two coordinates, namely,

$$\frac{\partial f}{\partial x} = [p_x, f], \qquad \frac{\partial f}{\partial y} = [p_y, f], \qquad \frac{\partial f}{\partial z} = [p_z, f], \tag{54}$$

the time $t$ being related to the energy operator $H$ in the same way as the
coordinates $x, y, z$ are related to the operators $p_x, p_y, p_z$ representing
the components of momentum. This relationship seems very natural
from the point of view of the relativity theory and seems to indicate
that time and energy must be treated on the same footing as the spatial
coordinates and the components of the momentum. The similarity
between the relations $dF/dt = [H, F]$ and $\partial f/\partial x = [p_x, f]$ is, however,
only apparent, for in the latter case $f$ denotes a function or operator
depending explicitly upon $x$, and $\partial/\partial x$ denotes partial differentiation
with regard to $x$, while in the former case $F$ is a function or operator
which does not contain $t$ explicitly. The time equivalent of equations
(54) is easily seen to be

$$\frac{\partial F}{\partial t} = [p_t, F]. \tag{54 a}$$
This equation follows immediately from the definition of the operator
$p_t = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$. Replacing $\partial F/\partial t$ in (52 b) by $[p_t, F]$, we get

$$\frac{dF}{dt} = [H + p_t, F]. \tag{54 b}$$

It should be noticed that the operator $H + p_t$ does not vanish identically,
as might appear from the equation $(H + p_t)\psi = 0$, but only with respect
to the functions defined by this equation and describing the general
type of motion determined by the Hamiltonian $H$. The fact that there
are actually two different operators $H$ and $-p_t$ representing the same
quantity, i.e. the energy, and equivalent to one another with respect
to the wave functions describing the motion of the particle, suggests
the possibility of restoring the symmetry between time and space which
is required by the relativity theory by introducing certain operators
$G_x, G_y, G_z$ which, though entirely different from $p_x, p_y, p_z$, would represent
the same thing as the latter, i.e. the components of the momentum.
The operators $G$ would have to be defined so as to be equivalent to
the corresponding $p$ with respect to the same wave functions as the
operators $H$ and $-p_t$. If this were possible, we could replace the time
in its exceptional role by any one of the three coordinates $x, y, z$,
e.g. we could define the wave functions by an equation of the type
$(G_x - p_x)\psi = 0$, and interpret $\psi\psi^*\,dy\,dz\,dt$ as the probability of finding
the particle in the region specified by $dy$, $dz$, and $dt$ for a definite value
of its $x$-coordinate. We could further define the average or probable
value of an operator by the formula $\bar{F} = \iiint \psi^* F\psi\,dy\,dz\,dt$ as a definite
function of $x$ and obtain for its derivative with respect to $x$ an expression
similar to (52) or (52 b), i.e.

$$\frac{dF}{dx} = \frac{\partial F}{\partial x} + [G_x, F] = [G_x + p_x, F],$$

provided the operator $G_x$ were self-adjoint, in the same sense as $H$.

This relativistic symmetry between space and time, as expressed by
the equal eligibility of any one of the four quantities $x, y, z, t$, and the
associated quantities $G_x, G_y, G_z, H$ to the presidential role which has
hitherto been enjoyed only by $t$ and $H$, cannot, however, be attained
if we retain the definition of the Hamiltonian operator

$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + U$$

which has so far been used and which corresponds to pre-relativistic
classical mechanics. This follows from the unsymmetrical way in
which the operators $p_x, p_y, p_z$, and $p_t$ are involved in the equation
$(H + p_t)\psi = 0$.

It is possible, however, to modify the Schrödinger equation so as to
secure the desired symmetry enabling one to formulate it in any of
the four equivalent ways $(G_x - p_x)\psi = 0$, $(G_y - p_y)\psi = 0$, $(G_z - p_z)\psi = 0$,
$(H + p_t)\psi = 0$ in agreement with the relativity theory. This modification
(due to Dirac) will be considered later (Chap. VI).


9. The Variational Form of the Schrödinger Equation and its
Application to the Perturbation Theory
If the potential energy $U$ does not involve the time explicitly, then
the equation $(H + p_t)\psi = 0$ has, as we know, particular solutions of the type
$\psi = \psi^0(x, y, z)\,e^{-2\pi i Wt/h}$, where the 'amplitude' function $\psi^0(x, y, z)$ satisfies
the equation $H\psi^0 = W\psi^0$ (which has been written before in the equivalent
form $H\psi = W\psi$). Multiplying it by $\psi^{0*}$ and integrating over the
whole space, then if, as we shall assume in future, $\int \psi^{0*}\psi^0\,dV = 1$,
we get

$$\int \psi^{0*}H\psi^0\,dV = W. \tag{55}$$

This is just what we should expect, since, according to the general
definition of probable (average) values, the integral

$$\int \psi^{0*}H\psi^0\,dV = \int \psi^* H\psi\,dV = \bar{H}$$

is the probable value $W$ of the energy which is a constant of the motion.
We shall now show that the function $\psi^0$, which may be called the
characteristic function of the operator $H$ (the time factor being
irrelevant so far as the equation $H\psi = W\psi$ is concerned), can be determined
from the variational principle

$$\delta\bar{H} = \delta\int \psi^{0*}H\psi^0\,dV = 0, \tag{55 a}$$
in conjunction with the normalization condition

$$\int \psi^0\psi^{0*}\,dV = 1. \tag{55 b}$$
We have in fact
$$\delta\bar{H} = \int \delta\psi^{0*}\,H\psi^0\,dV + \int \psi^{0*}H\,\delta\psi^0\,dV,$$

or, according to (51), i.e. because of the self-adjointness of $H$ and
because of the convergence of the integral $\int \psi^{0*}\delta\psi^0\,dV$,

$$\delta\bar{H} = \int \delta\psi^{0*}\,H\psi^0\,dV + \int \delta\psi^0\,H\psi^{0*}\,dV. \tag{56}$$
Further, (55 b) gives
$$\int \delta\psi^{0*}\,\psi^0\,dV + \int \delta\psi^0\,\psi^{0*}\,dV = 0. \tag{56 a}$$


So long as the function $\psi^0$ is looked for as a complex quantity, it is
equivalent to two real functions. We could therefore consider $\psi^0$ and
$\psi^{0*}$ as two independent unknown functions, and treat their variations
as arbitrary independent infinitesimal quantities, were it not for the
condition (56 a). According to the Lagrange 'method of multipliers',
this dependence can be removed by multiplying (56 a) by some constant
factor $C$ and subtracting the result from (56). This gives

$$\int \delta\psi^{0*}(H\psi^0 - C\psi^0)\,dV + \int \delta\psi^0(H\psi^{0*} - C\psi^{0*})\,dV = 0,$$
and since $\delta\psi^{0*}$ and $\delta\psi^0$ can now be regarded as completely arbitrary,
we must have $H\psi^0 = C\psi^0$ and $H\psi^{0*} = C\psi^{0*}$.

Thus from (55 a) and (55 b) we have obtained the Schrödinger equation
for the function $\psi^0$ and its conjugate complex function. The energy $W$
appears in the variational method as the value of Lagrange's multiplier
associated with the function $\psi^0$, and the Schrödinger equation appears
as the variational equation of Euler and Lagrange corresponding to the
'conditional extremum' of the integral $\bar{H} = \int \psi^{0*}H\psi^0\,dV$. This integral
can be written in a somewhat different form, one which contains
only the first derivatives of the functions $\psi^0$ and $\psi^{0*}$ (as it must do if
the variational equation is of the second order). We have in fact


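$$\psi^{0*}\nabla^2\psi^0 = \operatorname{div}(\psi^{0*}\nabla\psi^0) - \nabla\psi^{0*}\cdot\nabla\psi^0,$$
and consequently
$$\int \psi^{0*}H\psi^0\,dV = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2\left[\int \operatorname{div}(\psi^{0*}\nabla\psi^0)\,dV - \int \nabla\psi^{0*}\cdot\nabla\psi^0\,dV\right] + \int U\psi^{0*}\psi^0\,dV,$$

or, since the first integral in the square brackets vanishes,
$$\bar{H} = \int \left(\frac{h^2}{8\pi^2 m}\nabla\psi^{0*}\cdot\nabla\psi^0 + U\psi^{0*}\psi^0\right) dV. \tag{57}$$
Putting $\mathbf{p} = \dfrac{h}{2\pi i}\nabla$, we can rewrite this expression in the form

$$\bar{H} = \int \left(\frac{1}{2m}|\mathbf{p}\psi^0|^2 + U|\psi^0|^2\right) dV, \tag{57 a}$$
where $|\mathbf{p}\psi^0|^2$ is the scalar product of the vector $\mathbf{p}\psi^0$ and the conjugate
complex vector $\mathbf{p}^*\psi^{0*} = -\dfrac{h}{2\pi i}\nabla\psi^{0*}$. If, in addition, we introduce the
function $S = \dfrac{h}{2\pi i}\log\psi^0$, and so replace $\mathbf{p}\psi^0$ by $\psi^0\nabla S$, we get

$$\bar{H} = \int \left(\frac{1}{2m}|\nabla S|^2 + U\right)|\psi^0|^2\,dV. \tag{57 b}$$

The integrand of this expression looks exactly like the classical expression
for the total energy ($S$ being the Hamilton-Jacobi action function)
multiplied by $|\psi^0|^2$. It is worthy of remark that Schrödinger first
obtained his wave equation by applying the variation principle to the
integral (57 b), without fully realizing at that time (beginning of 1926)
its physical meaning.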

The variational equation $\delta\bar{H} = 0$ does not mean that the values of
$\bar{H} = W$ obtained from it (with the condition $\int \psi^0\psi^{0*}\,dV = 1$) are
minimum or maximum values compared with those corresponding to
slightly varied functions $\psi^0$. In order to find out whether we actually
have an extremum or only a stationary value, we must calculate the
variation of $\bar{H}$ to the second approximation, i.e. to the second order of
the small quantities $\delta\psi^0$ and $\delta\psi^{0*}$.
We thus get

$$\Delta\bar{H} = \int (\psi^{0*} + \delta\psi^{0*})H(\psi^0 + \delta\psi^0)\,dV - \int \psi^{0*}H\psi^0\,dV$$
$$= \int \delta\psi^{0*}\,H\psi^0\,dV + \int \psi^{0*}H\,\delta\psi^0\,dV + \int \delta\psi^{0*}\,H\,\delta\psi^0\,dV.$$

On the other hand, we must have

$$\int (\psi^{0*} + \delta\psi^{0*})(\psi^0 + \delta\psi^0)\,dV - \int \psi^{0*}\psi^0\,dV$$
$$= \int \delta\psi^{0*}\,\psi^0\,dV + \int \psi^{0*}\,\delta\psi^0\,dV + \int \delta\psi^{0*}\,\delta\psi^0\,dV = 0.$$

Multiplying this equation by the value of $W$ corresponding to the
function $\psi^0$ and subtracting it from the first, we get, since $\psi^0$ and $\psi^{0*}$
satisfy the equations $H\psi^0 = W\psi^0$, $H\psi^{0*} = W\psi^{0*}$,

$$\Delta\bar{H} = \int \delta\psi^{0*}(H - W)\,\delta\psi^0\,dV, \tag{58}$$

which can also be written in the form
$$\Delta\bar{H} = \int \left[\frac{1}{2m}|\mathbf{p}\,\delta\psi^0|^2 + (U - W)|\delta\psi^0|^2\right] dV. \tag{58 a}$$


This expression can be considered as the second variation of $\bar{H}$, since
it is a small quantity of the second order. Its sign is, in general,
uncertain: it may be positive for some variations $\delta\psi^0$ and negative for
others. The values $\bar{H} = W$ given by the variational principle $\delta\bar{H} = 0$
must therefore be regarded as stationary and not as minimum or
maximum values. The preceding results are simplified if we assume (as
we are usually entitled to do when we are dealing with stationary
states with no magnetic field present) that the wave function $\psi^0$ is real;
we need hardly, however, restate them in this simplified form.
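
The uncertain sign of the second variation (58) can be seen in the simplest finite example. The diagonal matrix below is a hypothetical stand-in for $H$ with three levels; the characteristic function belonging to the middle level is varied once in the direction of the lower and once in the direction of the higher level.

```python
import numpy as np

H = np.diag([1.0, 2.0, 4.0])          # hypothetical energy matrix
psi0 = np.array([0.0, 1.0, 0.0])      # characteristic function with W = 2
W = 2.0

def mean_energy(psi):
    psi = psi / np.linalg.norm(psi)   # keep the normalization condition (55 b)
    return psi @ H @ psi

eps = 1e-3
# Variation towards the lower level: (58) gives eps^2 (1 - W) = -eps^2
down = mean_energy(psi0 + eps * np.array([1.0, 0.0, 0.0])) - W
# Variation towards the higher level: (58) gives eps^2 (4 - W) = +2 eps^2
up = mean_energy(psi0 + eps * np.array([0.0, 0.0, 1.0])) - W
```

The change of $\bar{H}$ is of the second order in both cases, negative for the first variation and positive for the second, so that $W = 2$ is a stationary value only. For the lowest level all variations of this kind raise $\bar{H}$.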

The variational principle provides us with a very simple and important
method for obtaining approximate solutions of Schrödinger's equation
and determining the corresponding energy values, or rather for improving
such approximate solutions and energy values after they have been
obtained by some other method.† Thus the variational method is useful
in determining the motion due to a field of force which is slightly different
from some simpler field of force for which the motion is supposed to be
known. The solution of this question is one of the two main problems of the
perturbation theory, the other problem being the determination of transition
probabilities which has already been considered briefly in Part I.
We shall give a detailed treatment of the perturbation theory in a later
chapter. At present we shall briefly indicate those of its results which
can be obtained, in a straightforward way, by the variational method.

† The method of reducing the solution of a differential equation of the type
$H\psi^0 = W\psi^0$ to a variational problem has been worked out by Lord Rayleigh and much
later by W. Ritz in connexion with the problems of the vibration of elastic bodies, which
are formally very similar to the problem of the motion of a particle in wave mechanics.
Let us suppose that, somehow or other, we have obtained a function
$\psi^0(x, y, z; a)$ which we know to be capable of approximately representing
one of the characteristic functions of the operator $H$ provided the
undetermined parameter $a$, contained in it, is suitably chosen. Then
this particular value of $a$ can be determined from the equation

$$\frac{\partial\bar{H}(a)}{\partial a} = W\frac{\partial E(a)}{\partial a}, \tag{59}$$

where
$$\bar{H}(a) = \int \psi^{0*}(x, y, z; a)\,H\psi^0(x, y, z; a)\,dV, \tag{59 a}$$
and
$$E(a) = \int \psi^{0*}\psi^0\,dV, \tag{59 b}$$

in conjunction with the relation $\bar{H}(a) = W$, which gives the corresponding
value of the energy. If the function is normalized to 1 (according
to $E = 1$) for every value of $a$, equation (59) can be replaced by
$\partial\bar{H}(a)/\partial a = 0$.
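
A one-parameter calculation of this kind may be sketched as follows. The example is not taken from the text: it assumes a one-dimensional oscillator with $h/2\pi = m = \omega = 1$ and the hypothetical trial function $\psi^0 = e^{-ax^2}$, which is not normalized, so that the stationary value of the ratio $\bar{H}(a)/E(a)$ is sought; this is equivalent to (59), since the derivative of the ratio vanishes exactly when $\partial\bar{H}/\partial a = (\bar{H}/E)\,\partial E/\partial a$.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
U = 0.5 * x**2                        # assumed oscillator potential

def H_bar(a):
    psi = np.exp(-a * x**2)           # hypothetical trial function
    dpsi = np.gradient(psi, dx)
    # first-derivative form (57): gradient term plus potential term
    return np.sum(0.5 * dpsi**2 + U * psi**2) * dx

def E(a):
    psi = np.exp(-a * x**2)
    return np.sum(psi**2) * dx

a_grid = np.linspace(0.1, 2.0, 1901)
W_of_a = np.array([H_bar(a) / E(a) for a in a_grid])
best = np.argmin(W_of_a)
a_best, W_best = a_grid[best], W_of_a[best]
```

For this trial function the ratio equals $a/2 + 1/(8a)$, so the search should return a value of $a$ close to $1/2$ and an energy close to $1/2$, which in this particular case happens to be the exact lowest level.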

This method, which is often used in practice, can be generalized to
include the case when the function $\psi^0$ contains many unknown parameters
$a_1, a_2, \ldots, a_r$, the closeness of the approximation in general
increasing with the number $r$ of these parameters. We come upon a
particularly simple and interesting case of such an approximation in
the perturbation theory of a degenerate motion, where we have, in
the absence of the perturbation, a set of wave functions $\psi^0_1(x, y, z)$,
$\psi^0_2(x, y, z), \ldots, \psi^0_r(x, y, z)$ representing different states of motion with the
same energy $W$. Let us assume that the potential energy $U$ has been
replaced by $U'$, the difference $U' - U$ corresponding to a small perturbing
field of force (for example, an external electric field of force).
The energy operator $H = p^2/2m + U$ must then be replaced by the
operator $H' = p^2/2m + U' = H + U' - U$, and the functions $\psi^0_1, \psi^0_2, \ldots, \psi^0_r$
must be replaced by a set of $r$ functions $\psi'_1, \psi'_2, \ldots, \psi'_r$ referring to $r$
states of motion with nearly the same energy, i.e. belonging to $r$ energy
values $W'_1, W'_2, \ldots, W'_r$ which are slightly different from one another and
from the approximate value $W$ corresponding to the absence of perturbing
forces (the latter are, of course, supposed to be independent of
the time). Now the functions $\psi'_n$ can be represented approximately as
linear combinations of the functions $\psi^0_k$ with unknown coefficients.
Thus we may write

$$\psi'_n = \sum_{k=1}^{r} a_{nk}\psi^0_k, \tag{60}$$

the $r$ coefficients $a_{n1}, a_{n2}, \ldots, a_{nr}$ appearing in the expression of each
function $\psi'_n$ playing the role of the $r$ parameters mentioned above.


Dropping the index $n$ and substituting the expression $\psi' = \sum_k a_k\psi^0_k$ in
the integrals

$$\bar{H}' = \int \psi'^*H'\psi'\,dV \quad\text{and}\quad E' = \int \psi'^*\psi'\,dV$$

we get
$$\bar{H}' = \sum_{k=1}^{r}\sum_{l=1}^{r} H'_{kl}\,a_k^* a_l, \tag{60 a}$$

$$E' = \sum_{k=1}^{r}\sum_{l=1}^{r} E_{kl}\,a_k^* a_l, \tag{60 b}$$

where

$$H'_{kl} = \int \psi^{0*}_k H'\psi^0_l\,dV, \tag{60 c}$$
$$E_{kl} = \int \psi^{0*}_k\psi^0_l\,dV. \tag{60 d}$$

The expressions (60 c) are the matrix elements of the energy operator
$H'$ of the 'perturbed' motion with regard to the characteristic functions
describing the unperturbed types of motion associated with the same
energy $W$. Since these functions need not be orthogonal, the expressions
$E_{kl}$ may be different from zero for $k \neq l$.

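The variational principle $\delta\bar{H}' = 0$, together with the condition
$E' = 1$, gives the following equations:

$$\frac{\partial\bar{H}'}{\partial a_k^*} = W'\frac{\partial E'}{\partial a_k^*}, \qquad \frac{\partial\bar{H}'}{\partial a_l} = W'\frac{\partial E'}{\partial a_l},$$
i.e.
$$\sum_{l=1}^{r} (H'_{kl} - W'E_{kl})\,a_l = 0 \qquad (k = 1, 2, \ldots, r), \tag{61}$$
$$\sum_{k=1}^{r} (H'_{kl} - W'E_{kl})\,a_k^* = 0. \tag{61 a}$$

The second group can be obtained from the first by a change to conjugate
complex quantities in conjunction with the 'Hermitian' relations
(Part I, § 17) $H'_{kl} = H'^*_{lk}$ and $E_{kl} = E^*_{lk}$,
and therefore need not be considered separately. The compatibility
condition for the $r$ linear homogeneous equations (61) runs

$$\begin{vmatrix} H'_{11} - W'E_{11} & H'_{12} - W'E_{12} & \cdots & H'_{1r} - W'E_{1r} \\ H'_{21} - W'E_{21} & H'_{22} - W'E_{22} & \cdots & H'_{2r} - W'E_{2r} \\ \vdots & \vdots & & \vdots \\ H'_{r1} - W'E_{r1} & H'_{r2} - W'E_{r2} & \cdots & H'_{rr} - W'E_{rr} \end{vmatrix} = 0. \tag{61 b}$$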
This is an equation of the $r$th degree for $W'$; its roots $W'_1, W'_2, \ldots, W'_r$
are the required (approximate) values of the energy. The coefficients
$a_{n1}, a_{n2}, \ldots, a_{nr}$
corresponding to $W' = W'_n$, according to (61), specify, by means of
equation (60), that type of perturbed motion which has the energy $W'_n$.
We thus see that the $r$ types of unperturbed motion which have the
same energy $W$ and which are described by the functions $\psi^0_1, \ldots, \psi^0_r$
actually give rise, under the influence of the perturbation, to the same
number of different types of motion, but these, in general, now have
different energies $W'_1, \ldots, W'_r$. This phenomenon is denoted as the
'splitting up' of a multiple energy-level, by the influence of perturbing
forces, into a number of 'sub-levels'. The Zeeman and Stark effects, i.e.
the splitting of the spectrum lines under the influence of a magnetic or
electric field, are examples of this.
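
Equations (61) and (61 b) form what is now called a generalized characteristic-value problem, and for a small number of functions they can be solved numerically. The sketch below uses hypothetical 2×2 matrices $H'_{kl}$ and $E_{kl}$ (real and symmetric, with non-orthogonal unperturbed functions). The reduction to an ordinary problem by a Cholesky factorization $E = LL^T$ is a computational device assumed here, not a step taken from the text.

```python
import numpy as np

# Hypothetical matrices for a doubly degenerate level (r = 2).
H_prime = np.array([[1.00, 0.30],
                    [0.30, 1.10]])    # matrix elements H'_kl of (60 c)
E = np.array([[1.0, 0.2],
              [0.2, 1.0]])            # overlap integrals E_kl of (60 d)

# Reduce (61) to an ordinary characteristic-value problem via E = L L^T.
L = np.linalg.cholesky(E)
L_inv = np.linalg.inv(L)
W_prime, b = np.linalg.eigh(L_inv @ H_prime @ L_inv.T)
a = L_inv.T @ b                       # column n: coefficients a_l belonging to W'_n

# Checks: equations (61), the determinant (61 b), and the condition E' = 1.
residual = H_prime @ a - (E @ a) * W_prime
dets = [np.linalg.det(H_prime - w * E) for w in W_prime]
norms = np.einsum('kn,kl,ln->n', a, E, a)
```

The two roots `W_prime` are the sub-levels into which the double level splits in this example.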

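It should be mentioned that if the functions $\psi^0_k$ are orthogonal and
normalized to 1, i.e. if $E_{kl}$ is equal to 0 for $k \neq l$ and to 1 for $k = l$,
equations (61) assume the form

$$\sum_{l=1}^{r} H'_{kl}\,a_l = W'a_k \qquad (k = 1, 2, \ldots, r), \tag{62}$$

and the compatibility equation for determining the energy values
reduces to

$$\begin{vmatrix} H'_{11} - W' & H'_{12} & \cdots & H'_{1r} \\ H'_{21} & H'_{22} - W' & \cdots & H'_{2r} \\ \vdots & \vdots & & \vdots \\ H'_{r1} & H'_{r2} & \cdots & H'_{rr} - W' \end{vmatrix} = 0. \tag{62 a}$$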

Equations (60), (62), and (62 a) closely resemble equations (48), (48 a),
and (48 b) derived in § 7 for the determination of the characteristic
values of an operator $F$ which is a constant of a motion involving
degeneracy. Actually they are identical, but this is slightly masked by
a difference in notation. If we replace $F$ by $H'$, reverse the role of the
'old' and 'new' functions $\psi$ and $\psi'$, replacing the $\psi$ by $\psi'$ and the $\psi'$
by $\psi^0$, and in addition write $H'_{kl}$ instead of $c_{kl}$ and $W'$ instead of $c'$,
then equations (48), (48 a), and (48 b) assume the form of (60), (62),
and (62 a) respectively. This coincidence shows that the operators $H$
and $H'$ must commute with one another, i.e. that, to the degree of
approximation obtained by the perturbation theory sketched above,
the perturbation energy $H' - H$ is to be considered as a constant of the
unperturbed motion specified by $H$.

This perturbation theory can easily be improved and generalized in
such a way as to become what is called a transformation theory, the
primary object of which is to derive exactly the characteristic functions
and values of a certain operator $H'$ from the characteristic functions and
values of some other operator $H$. The solution of this problem is given
by the preceding equations if, in the first place, we drop the assumption
that the original (amplitude) functions $\psi^0_1, \psi^0_2, \ldots, \psi^0_r$ belong to the same
energy-level, and if, in addition, we increase $r$ to infinity, so as to use
the complete set of functions and energy-levels belonging to the operator
$H$. Equations (60) and (61) or (62), in conjunction with (61 b) or (62 a),
will then determine the complete set of functions and energy values
characteristic of the operator $H'$. Further generalizations of this transformation
theory involving operators different from the energy and
variables different from the coordinates will be examined later (Chap. IV).

It should be mentioned here that the reduction of an equation of the
form $F\psi = C\psi$ to a variational principle of the form

$$\delta\bar{F} = \delta\int \psi^* F\psi\,dV = 0$$

(with the condition $\int \psi\psi^*\,dV = 1$) is possible not only when $F$ is the
energy operator $H$, but in the case of all operators which are 'self-adjoint',
i.e. for which $f_1 Ff_2 - f_2 Ff_1$ = the divergence of some vector.
Actually it is not necessary for the integral $\int \psi\psi^*\,dV$ to converge. The
only assumption which it is necessary to make in order to obtain the
differential equation $F\psi = C\psi$ from the variational equation $\delta\bar{F} = 0$
is that $E = \int \psi\psi^*\,dV$ should be constant ($\delta E = 0$).


10. Orthogonality and Normalization of Characteristic Functions 
for Discrete and Continuous Spectra 

The characteristic functions $\psi^0$ obtained by the variation principle,
under the condition $\int \psi^0\psi^{0*}\, dV = \text{const.}$, or by the direct solution of
the equation $H\psi^0 = W\psi^0$, can form both a discrete and a continuous
set corresponding to a discrete or a continuous set of energy values $W$.
The energy values are therefore said to form a discrete or a continuous
spectrum of the energy operator $H$. As we know from the general
discussion of § 15, Part I, and from the examples of the oscillator and
the hydrogen atom, a discrete spectrum is associated with characteristic
functions which, because of ‘total reflection’, vanish at infinity so
rapidly that the integral $\int \psi^0\psi^{0*}\, dV$ converges. This makes it possible
to normalize them to 1 by means of the equation $\int \psi^0\psi^{0*}\, dV = 1$. The
characteristic functions corresponding to a continuous $W$-spectrum may
also, although not necessarily, vanish at infinity, but not rapidly
enough (because of the lack of total reflection) to ensure the convergence
of the integral $\int \psi^0\psi^{0*}\, dV$, so that their normalization to 1, or to any
other finite value, is in this case impossible.



This relationship between the convergence or non-convergence of the
integral $\int \psi^0\psi^{0*}\, dV$ (which is a measure of the probability of finding
the particle somewhere in the whole of space) and the discrete or
continuous character of the energy spectrum is intimately connected with
the relationship between the characteristic functions $\psi_n^0$ and $\psi_m^0$, which
are associated with or ‘belong to’ different values of the energy $W_n$
and $W_m$.

If the equation $H\psi_n^0 = W_n\psi_n^0$ which is satisfied by $\psi_n^0$ is multiplied
by $\psi_m^{0*}$ and subtracted from the equation $H\psi_m^{0*} = W_m\psi_m^{0*}$ multiplied by
$\psi_n^0$, we get

$$\psi_n^0 H \psi_m^{0*} - \psi_m^{0*} H \psi_n^0 = (W_m - W_n)\,\psi_m^{0*}\psi_n^0.$$

Integrating over the whole space, and assuming the integrals $\int |\psi_n^0|^2\, dV$
and $\int |\psi_m^0|^2\, dV$ to be convergent, we get, because of the self-adjointness
of the energy operator according to (51),

$$(W_m - W_n) \int \psi_m^{0*}\psi_n^0\, dV = 0,$$

and since $W_m \neq W_n$,

$$\int \psi_m^{0*}\psi_n^0\, dV = 0. \tag{63}$$
This is the ‘orthogonality property’ which has already been deduced
for one-dimensional motion in § 17, Part I. As shown there, this
property can still be retained even when the states are degenerate, i.e.
when different functions $\psi_n^0$ and $\psi_m^0$ belong to the same energy-level,
provided these functions are suitably chosen as linear combinations of
the original ones (if the latter do not already satisfy the orthogonality
condition). If the energy values corresponding to different functions
are distinguished by different indices, irrespective of whether these
values are actually different or identical, the orthogonality relation
(63) and the normalization condition $\int \psi_n^0\psi_n^{0*}\, dV = 1$ can be fused into a
single equation

$$\int \psi_m^{0*}\psi_n^0\, dV = \delta_{mn}, \tag{63 a}$$

where $\delta_{mn} = 1$ if $m = n$ and $\delta_{mn} = 0$ if $m \neq n$.
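
The relation (63 a) can be checked numerically on any concrete discrete spectrum. The following is a minimal sketch, assuming a particle confined between rigid walls at $x = 0$ and $x = 1$ (an illustrative system chosen here, not one of the examples treated in the text); the names `box_state` and `overlap` are hypothetical helpers introduced only for this illustration.

```python
import math

def box_state(n, x, length=1.0):
    """Normalized characteristic function of the assumed box problem."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)

def overlap(m, n, length=1.0, steps=2000):
    """Midpoint-rule estimate of the integral of psi_m * psi_n over the box."""
    h = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += box_state(m, x, length) * box_state(n, x, length)
    return total * h

# Table of overlaps for m, n = 1, 2, 3; it should approximate delta_mn.
for m in (1, 2, 3):
    print([round(overlap(m, n), 6) for n in (1, 2, 3)])
```

Since these functions are real, the complex conjugation in (63 a) plays no part in this particular case.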

It should be mentioned that the existence of degeneracy must be
regarded not as a general rule, but rather as an exceptional occurrence.
It only arises in a few cases in which the particle is moving in an
exceptionally simple field of force. Nevertheless, the simple types of
the potential-energy function $U$ corresponding to these simple fields
of force are of great practical importance.

As shown in Part I when discussing examples of motion in three
dimensions, the different characteristic functions are specified by the
values of three quantum numbers $n_1, n_2, n_3$, which, from the geometrical
point of view, give the number of nodal surfaces of the different kinds
and which, from the dynamical point of view, specify the characteristic
values of three operators $F_1, F_2, F_3$, representing three independent
constants of the motion which is described by the corresponding
characteristic function. The energy operator $H$ can be defined as a certain
function of the operators $F_1, F_2, F_3$, its characteristic values being
equal to the same function of the characteristic values $C'_{n_1}, C''_{n_2}, C'''_{n_3}$
of these three operators. The existence of such operators is connected
with the existence of ‘separable coordinates’ $q_1, q_2, q_3$, these coordinates
being such that each characteristic function of $H$ can be represented as
the product of three functions $\psi^{(1)}_{n_1 n_2 n_3}(q_1)$, $\psi^{(2)}_{n_1 n_2 n_3}(q_2)$, $\psi^{(3)}_{n_1 n_2 n_3}(q_3)$
satisfying the equations

$$F_i\, \psi^{(i)}_{n_1 n_2 n_3}(q_i) = C^{(i)}_{n_i}\, \psi^{(i)}_{n_1 n_2 n_3}(q_i) \qquad (i = 1, 2, 3), \tag{64}$$

where $C^{(1)} = C'$, $C^{(2)} = C''$, $C^{(3)} = C'''$. Since

$$\psi_{n_1 n_2 n_3}(x, y, z) = \psi^{(1)}_{n_1 n_2 n_3}(q_1)\, \psi^{(2)}_{n_1 n_2 n_3}(q_2)\, \psi^{(3)}_{n_1 n_2 n_3}(q_3), \tag{64 a}$$

these become $F_i\, \psi_{n_1 n_2 n_3} = C^{(i)}_{n_i}\, \psi_{n_1 n_2 n_3}$, with

$$H(F_1, F_2, F_3)\, \psi_{n_1 n_2 n_3} = H(C'_{n_1}, C''_{n_2}, C'''_{n_3})\, \psi_{n_1 n_2 n_3}, \tag{64 b}$$

where $H(C', C'', C''')$ is the same function of the numbers $C', C'', C'''$ as
$H$ is of the operators $F_1, F_2, F_3$.

In the approximate quasi-classical determination of the function $\psi^0$ in
the form $e^{i2\pi S^0/h}$, where $S^0$ is the action function of the Hamilton-Jacobi
theory, the product relation (64a) corresponds to the additive relation

$$S^0(x, y, z) = S'(q_1) + S''(q_2) + S'''(q_3), \tag{64 c}$$

which serves to define the separable coordinates in the classical sense.
The quantum numbers $n_1, n_2, n_3$ are introduced by the condition that
the periodicity moduli of $S^{(i)}(q_i)$ must be integral multiples $n_i$ of $h$.
The energy $W(C', C'', C''')$ can be written as a function of the quantum
numbers in the form $W_{n_1 n_2 n_3}$. We have degeneracy when the energy
actually depends on only two or one of these numbers, or upon their
sum, as in the case of a hydrogen-like atom, where we may assume that
$n_1$ denotes the radial quantum number, $n_2 = l$ the angular quantum
number, and $n_3 = m$ the axial quantum number, $F_2$ being the operator
$M^2$ and $F_3$ the operator $M_z$, and hence

$$\psi^{(2)}_{n_1 n_2 n_3}(q_2) = P_{lm}(\theta), \qquad \psi^{(3)}_{n_1 n_2 n_3}(q_3) = e^{im\phi}.$$
It is always possible to arrange the triplets of numbers $n_1, n_2, n_3$ in
a single row and to specify the functions $\psi^0$ and the energy-levels $W$
by a single index $n$ indicating the position of the corresponding triplet
in the row. The indices $n$ ($\psi_n^0$, $W_n$) so obtained will, of course, have no
connexion with the quantum numbers. One can also use a kind of vector
notation, writing $n$ as an abbreviation for the three indices $n_1, n_2, n_3$.
This is the notation used in § 17 of Part I, and we shall use it in future
when dealing with states of motion belonging to a discrete spectrum.

A continuous spectrum of the energy operator $H$ arises when at
least one of the three operators $F_i$, corresponding to the separation
coordinates, has a continuous spectrum of characteristic values, the
spectra of the other two operators remaining discrete (although of
course they may be continuous too). This case occurs with
hydrogen-like atoms in the region of positive energy values, i.e. in the region
corresponding to the non-periodic (hyperbolic) motions of the classical
theory. The wave functions can still, in this case, be written in
the form of a product (64a), the radial quantum number ($n_1$) being
replaced by a continuously variable parameter. We may take as this
parameter the characteristic values $C'$ of the operator $F_1$ itself, or
the values of the energy which it determines in conjunction with the
quantized parameters $C''$ and $C'''$. It will be convenient to use for the
characteristic functions belonging to a continuous energy spectrum a
notation similar to that corresponding to the discrete case, replacing
the quantum numbers as indices by the characteristic values of the
operators $F$ and writing $C$ as an abbreviation for the triplet $C', C'', C'''$,
so that the characteristic functions and energies are written $\psi_C^0(x, y, z)$
and $W_C$ respectively. If this abbreviation is not desired, it may be
preferable to use a mixed notation involving continuously variable
parameters as well as quantum numbers (e.g. the characteristic functions
of the hydrogen-like atom can be written in the form $\psi^0_{W l m}$, where the
energy $W$ stands for the continuously variable parameter $C'$).

It should be mentioned that a continuous spectrum corresponds to
non-quantizable or partially quantizable motions that can be
described quasi-classically, i.e. with an approximately determined action
function $S^0$, which is either single-valued, or has a many-valuedness of
a kind restricted to one or two of the parts into which it is separated
according to (64c). The wave functions $\psi_C^0$ belonging to a continuous
spectrum $W_C$ do not possess the orthogonality property which is
characteristic of the functions $\psi_n^0$ belonging to the discrete spectrum,
since, as we saw when deriving the orthogonality relation (63), this
relation depends not only upon the self-adjointness of the operator $H$,
but also on the convergence of the integrals $\int |\psi^0|^2\, dV$. These integrals
converge for $\psi^0 = \psi_n^0$ but do not converge for $\psi^0 = \psi_C^0$.

The connexion between the lack of orthogonality and the continuous
character of the energy spectrum can be illustrated by the following
argument. Let us suppose that $\psi^0_{C_1}$ and $\psi^0_{C_2}$ are two functions belonging
to two different energy-levels $W_{C_1}$ and $W_{C_2}$. Since the latter form a
continuous series, their difference can be made arbitrarily small. Now
if the orthogonality relation (63) applied to the continuous case, then
the integral $\int \psi^{0*}_{C_1}\psi^0_{C_2}\, dV$ would jump discontinuously from zero to
infinity as we go from nearly equal values of $C_1$ and $C_2$ (corresponding
to nearly equal values of the energy) to the limiting case $C_1 = C_2$.

It should also be mentioned that, with the exception of a motion
with one degree of freedom, i.e. specified by one coordinate only, the
continuous spectrum possesses a degeneracy of an infinitely high degree,
in the sense that each energy value can be associated with an infinite
number of different states of motion, represented by different functions
$\psi_C^0$. In the case of a continuous energy spectrum it is possible, and
indeed is often necessary, to consider not merely exactly defined states
of motion corresponding to perfectly definite values of the continuously
variable parameters $C$, but rather states of motion represented by a
superposition of exactly defined states corresponding to a very small
range $\Delta C$ of these parameters, i.e. by wave functions of the type

$$\int_{\Delta C} \psi_C\, dC = \phi_{\Delta C}, \tag{65}$$

where the integration is extended over the range $\Delta C$. The wave
functions obtained in this way obviously represent a generalization of those
functions which have been used in Part I to represent ‘wave groups’
or ‘wave packets’. In defining these generalized ‘wave-packet’
functions, we must take into account the time factor in the expression
$\psi_C = \psi_C^0\, e^{-i2\pi W_C t/h}$, since the energy $W_C$ is also a function of $C$. So long,
however, as the region $\Delta C$ is very small, the function (65) can be
written in the form

$$\phi_{\Delta C} = \phi^0_{\Delta C}\, e^{-i2\pi W_{C_0} t/h}, \tag{65 a}$$

where $C_0$ denotes some arbitrarily chosen ‘point’ contained in $\Delta C$, and
$\phi^0_{\Delta C}$ is a certain function not only of the coordinates, but also of the
time, representing the propagation of the wave packet.

For various reasons, it is usually more convenient to consider the
functions $\phi^0_{\Delta C}$ at a particular instant $t = 0$, in which case they can be
defined by the integral

$$\phi^0_{\Delta C} = \int_{\Delta C} \psi_C^0\, dC, \tag{65 b}$$

and to represent the inexactly defined states of motion for any time by
the product of (65b) by $e^{-i2\pi W_{C_0} t/h}$.



Let us imagine that the whole region formed by the variable
parameters $C$ (it may be a ‘line’, a ‘surface’, or a ‘space’, depending upon
the number of continuously variable parameters in the triplet denoted
by $C$) is divided into very small elements $\Delta C_1, \Delta C_2, \ldots, \Delta C_n$, which do
not overlap, and let us consider instead of the exact states the
inaccurately determined states which are represented by the amplitude
functions $\int_{\Delta C_n} \psi_C^0\, dC$ ($n = 1, 2, 3, \ldots$). These states can be associated with
a discrete set of energy values $W_n$, referring to certain (arbitrarily
chosen) points of the corresponding elementary regions $\Delta C_n$.

It can be shown that in the limiting case when the size of each region
is decreased to zero (their number increasing to infinity) the functions

$$\bar\psi_n^0 = \frac{1}{\sqrt{\Delta C_n}} \int_{\Delta C_n} \psi_C^0\, dC \tag{66}$$

behave in the same way as the ordinary amplitude functions belonging
to a discrete spectrum, i.e. in such a way that the integrals $\int \bar\psi_n^{0*}\bar\psi_n^0\, dV$
are convergent. This result follows from the oscillatory character of
the functions $\psi_C^0$ at large distances (see below). Since the functions
(66) satisfy in the limit the same equation as the corresponding exact
functions (for $W = W_n$), it follows that they must be mutually
orthogonal and further that they can be normalized to 1, so that we can put

$$\int \bar\psi_m^{0*}\bar\psi_n^0\, dV = \delta_{mn}. \tag{66 a}$$
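Let us consider, for example, the functions

$$\psi_k^0 = A(k)\, e^{i2\pi kx},$$

which describe a force-free one-dimensional motion with a momentum
$g = hk$ and a kinetic energy $W = k^2h^2/2m$.
If we regard $A$ as a slowly varying function of $k$, we get

$$\phi_1^0 = \int_{k_1 - \frac12\Delta k}^{k_1 + \frac12\Delta k} \psi_k^0\, dk = A(k_1) \int_{k_1 - \frac12\Delta k}^{k_1 + \frac12\Delta k} e^{i2\pi kx}\, dk = A(k_1)\, e^{i2\pi k_1 x}\, \frac{\sin \pi\Delta k\, x}{\pi x}.$$

We thus obtain, replacing the volume integration by an integration
along the $x$-axis,

$$\int_{-\infty}^{+\infty} |\bar\psi_1^0|^2\, dx = \lim \frac{1}{\Delta k} \int_{-\infty}^{+\infty} |\phi_1^0|^2\, dx = |A(k_1)|^2 \lim \frac{1}{\Delta k} \int_{-\infty}^{+\infty} \left( \frac{\sin \pi\Delta k\, x}{\pi x} \right)^2 dx$$

$$= |A(k_1)|^2\, \frac{1}{\pi} \int_{-\infty}^{+\infty} \left( \frac{\sin \xi}{\xi} \right)^2 d\xi = |A(k_1)|^2,$$

i.e. by (66a), $|A(k_1)|^2 = 1$.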
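
The integral that fixes this normalization can be estimated numerically. The sketch below assumes $A = 1$ and replaces the infinite range of $x$ by a large finite interval, so the result is only approximately 1; `packet_norm` and its parameters are hypothetical names introduced for this illustration.

```python
import math

def packet_norm(dk, x_max=2000.0, steps=400000):
    """(1/dk) * integral over [-x_max, x_max] of (sin(pi*dk*x) / (pi*x))**2 dx.

    Midpoint rule; for an even number of steps no midpoint falls on x = 0.
    """
    h = 2.0 * x_max / steps
    total = 0.0
    for i in range(steps):
        x = -x_max + (i + 0.5) * h
        total += (math.sin(math.pi * dk * x) / (math.pi * x)) ** 2
    return total * h / dk

# The value should be close to 1 whatever the width dk of the interval,
# the small deficit coming from the truncated tails of the integrand.
for dk in (0.5, 0.1):
    print(dk, packet_norm(dk))
```

The factor $1/\Delta k$ in front of the integral is what the square root in (66) supplies; without it the result would be proportional to $\Delta k$ and would vanish in the limit.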


It should be noticed that the normalizing condition only determines
the modulus of the coefficient $A(k)$. We can still multiply it by an
arbitrary phase factor of the form $e^{i\gamma}$ ($\gamma$ real).

Likewise we find for two intervals $\Delta k_1$ and $\Delta k_2$ about the different
mean values $k_1$ and $k_2$:

$$\phi_1^{0*}\phi_2^0 = A_1^* A_2\, e^{i2\pi(k_2 - k_1)x}\, \frac{\sin \pi\Delta k_1 x}{\pi x}\, \frac{\sin \pi\Delta k_2 x}{\pi x}.$$

If, for simplicity, we put $\Delta k_1 = \Delta k_2 = \Delta k$ ($k_1 \neq k_2$), then the integral
$\frac{1}{\Delta k} \int \phi_1^{0*}\phi_2^0\, dx$ assumes the form

$$\frac{A_1^* A_2}{\pi} \int_{-\infty}^{+\infty} e^{i2\alpha\xi} \left( \frac{\sin \xi}{\xi} \right)^2 d\xi \qquad \left( \alpha = \frac{k_2 - k_1}{\Delta k},\ \xi = \pi\Delta k\, x \right).$$

When $\Delta k \to 0$ the quantity $(k_2 - k_1)/\Delta k$ becomes infinite and therefore
this integral must in the limit be zero. These results can easily be
generalized so as to apply to free motion in three dimensions,
represented by a wave function of the form

$$\psi^0_{\mathbf k} = A(\mathbf k)\, e^{i2\pi \mathbf k \cdot \mathbf r} = A(k_x, k_y, k_z)\, e^{i2\pi(k_x x + k_y y + k_z z)},$$

since this function is equal to the product of three functions
representing one-dimensional motions parallel to the three coordinate axes
respectively, the integrals both with respect to $k_x, k_y, k_z$ as well as with
respect to $x, y, z$ thus reducing to products of integrals for the separate
components. (It should be remarked that $\Delta C$ must be defined in this
case as the product $\Delta k_x\, \Delta k_y\, \Delta k_z$.)
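
The vanishing of the overlap between different wave packets can also be seen numerically. The sketch below evaluates $\frac{1}{\pi}\int e^{i2\alpha\xi}(\sin\xi/\xi)^2\, d\xi$ with $A_1 = A_2 = 1$ on a truncated range; `packet_overlap` is a hypothetical helper. The comparison value $\max(0,\, 1 - |\alpha|)$ is the standard Fourier transform of $(\sin\xi/\xi)^2$, quoted here as an outside check and not taken from the text.

```python
import math

def packet_overlap(alpha, xi_max=2000.0, steps=400000):
    """(1/pi) * integral of cos(2*alpha*xi) * (sin(xi)/xi)**2 over [-xi_max, xi_max].

    Only the cosine part of exp(i*2*alpha*xi) is kept: the sine part is odd
    in xi and integrates to zero over a symmetric range.
    """
    h = 2.0 * xi_max / steps
    total = 0.0
    for i in range(steps):
        xi = -xi_max + (i + 0.5) * h   # midpoints avoid xi = 0
        total += math.cos(2.0 * alpha * xi) * (math.sin(xi) / xi) ** 2
    return total * h / math.pi

# alpha = (k2 - k1) / dk: the overlap falls off linearly and is zero
# (up to truncation error) once |alpha| >= 1.
for alpha in (0.0, 0.5, 1.0, 3.0):
    print(alpha, packet_overlap(alpha), max(0.0, 1.0 - abs(alpha)))
```

On this reading the overlap already vanishes as soon as the two intervals of width $\Delta k$ cease to overlap, which is a slightly stronger statement than the limiting argument requires.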

The general proof of the quadratic integrability of the functions
(66) can be derived from a very simple physical consideration, namely,
from the fact that, at very large distances, the motion represented by
any function $\psi_C^0$ must approximate to a force-free motion, at least in
all problems of practical interest for which the field of force determining
the motion of the particle is supposed to vanish at infinity.

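Taking again the function $\psi_k^0 = e^{i2\pi kx}$ as a typical representative of
wave functions belonging to a continuous spectrum (for the case of
one-dimensional motion), let us consider the double integral

$$J = \iint \psi_{k_1}^{0*}\psi_{k_2}^0\, dx\, dk_2 = \iint e^{i2\pi(k_2 - k_1)x}\, dx\, dk_2,$$

extended from $-\infty$ to $+\infty$ both with regard to $k_2$ and $x$. Since each
of the simple integrals over $k_2$ and over $x$ taken separately between
these limits does not have a definite value, let us define the value of
$J$ as the limit of $J_{\Delta k} = \int_{-\infty}^{+\infty} dx \int_{k_1 - \Delta k}^{k_1 + \Delta k} e^{i2\pi(k_2 - k_1)x}\, dk_2$ for $\Delta k \to \infty$, or the limit
of $J_\xi = \int_{-\infty}^{+\infty} dk_2 \int_{-\xi}^{+\xi} e^{i2\pi(k_2 - k_1)x}\, dx$ for $\xi \to \infty$. In the former case we have

$$\int_{k_1 - \Delta k}^{k_1 + \Delta k} e^{i2\pi(k_2 - k_1)x}\, dk_2 = \frac{\sin 2\pi\Delta k\, x}{\pi x},$$

$$J_{\Delta k} = \int_{-\infty}^{+\infty} \frac{\sin 2\pi\Delta k\, x}{\pi x}\, dx = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\sin p}{p}\, dp = 1,$$

independently of $\Delta k$ and therefore in particular for $\Delta k = \infty$, which gives
$J = 1$. In the latter case we get similarly

$$\int_{-\xi}^{+\xi} e^{i2\pi(k_2 - k_1)x}\, dx = \frac{\sin 2\pi(k_2 - k_1)\xi}{\pi(k_2 - k_1)},$$

$$J_\xi = \int_{-\infty}^{+\infty} \frac{\sin 2\pi(k_2 - k_1)\xi}{\pi(k_2 - k_1)}\, dk_2 = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\sin p}{p}\, dp = 1,$$

independently of $\xi$, and in particular for $\xi = \infty$. The two definitions of
$J$ thus lead to the same result, namely, $J = 1$.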

Let us now assume that $\psi_k^0 = A(k)\, e^{i2\pi kx}$, where $A(k)$ is some relatively
slowly varying (non-oscillatory) function of $k$, and let us define the
double integral

$$J = \iint \psi_{k_1}^{0*}\psi_{k_2}^0\, dk_2\, dx$$

as the limit of

$$J_\xi = \int_{-\infty}^{+\infty} dk_2 \int_{-\xi}^{+\xi} \psi_{k_1}^{0*}\psi_{k_2}^0\, dx$$

for $\xi = \infty$. Then since

$$J_\xi = \int_{-\infty}^{+\infty} A^*(k_1) A(k_2)\, \frac{\sin 2\pi(k_2 - k_1)\xi}{\pi(k_2 - k_1)}\, dk_2 = \frac{1}{\pi} A^*(k_1) \int_{-\infty}^{+\infty} A\!\left( k_1 + \frac{p}{2\pi\xi} \right) \frac{\sin p}{p}\, dp,$$

we get $J = A^*(k_1) A(k_1) = |A(k_1)|^2$.

Hence it follows that the ‘normalization’ $|A(k_1)|^2 = 1$ which has been
derived above for the function $\psi_k^0 = A(k)\, e^{i2\pi kx}$ with the help of (66)
and (66a) (with $n = m = k$) can be obtained just as well from the
condition $\int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \psi_{k_1}^{0*}\psi_{k_2}^0\, dk_2\, dx = 1$. This result can easily be generalized
for any functions $\psi_C^0$ belonging to a continuous energy spectrum, the
normalization condition of the usual type for the quasi-discrete functions

$$\bar\psi_n^0 = \lim \frac{1}{\sqrt{\Delta C_n}} \int_{\Delta C_n} \psi_C^0\, dC,$$

namely, $\int \bar\psi_n^0 \bar\psi_n^{0*}\, dV = 1$,
being equivalent to the condition

$$\iint \psi_{C_1}^{0*}\psi_{C_2}^0\, dC_2\, dV = 1. \tag{67}$$


The latter is similar to the equation

$$\sum_m \int \psi_m^{0*}\psi_n^0\, dV = 1$$

for functions belonging to a discrete spectrum. This equation is an
immediate consequence of the normalization and orthogonality relations

$$\int \psi_m^{0*}\psi_n^0\, dV = \delta_{mn}.$$
It is possible to treat equation (67) in a similar way, i.e. to consider it as
a corollary following from an orthogonality and normalization relation
for the functions $\psi_C^0$ which, according to Dirac, can be written in
the form

$$\int \psi_{C_1}^{0*}\psi_{C_2}^0\, dV = \delta(C_2 - C_1), \tag{67 a}$$

where $\delta(C)$ denotes a somewhat unusual type of function, rather defined
by the left side of this equation (together with the condition (67)) than
defining it. As a matter of fact, this function does not depend upon
the particular type of the function $\psi_C^0$ so long as $\psi_C^0$ satisfies the
condition (67) which reduces to

$$\int \delta(C_2 - C_1)\, dC_2 = 1,$$

or

$$\int \delta(C)\, dC = 1, \tag{67 b}$$

the integration being extended over all values of the continuously
variable parameter (or parameters) $C$.

It is obvious that for $C = 0$ (i.e. $C_2 = C_1$), the function $\delta(C)$ becomes
infinite. It seems, however, impossible to assign to it a definite value
for $C \neq 0$. Take, for example, the normalized function $\psi_k^0 = e^{i2\pi kx}$
(with $C = k$). According to the definition (67a), we have

$$\delta(k_2 - k_1) = \int_{-\infty}^{+\infty} e^{i2\pi(k_2 - k_1)x}\, dx,$$

i.e.

$$\delta(k) = \int_{-\infty}^{+\infty} e^{i2\pi kx}\, dx. \tag{68}$$


This expression has no definite value. We can, however, replace it, as
we have actually done above in the evaluation of the integral $J$, by

$$\delta_\xi(k) = \int_{-\xi}^{+\xi} e^{i2\pi kx}\, dx, \tag{68 a}$$

and pass to the limit $\xi \to \infty$ after the completion of all the calculations in
which the function $\delta_\xi(k)$ enters, and in particular after integration over
$k$ (which always forms a part of these calculations). The result will
have a perfectly definite value, and indeed the same value as that which
would be obtained by putting from the very beginning

$$\delta(k) = 0 \ \text{ for } k \neq 0 \quad \text{and} \quad \int \delta(k)\, dk = 1. \tag{68 b}$$

The above calculation of the integral $J = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \psi_{k_1}^{0*}\psi_{k_2}^0\, dk_2\, dx$ for a
function of the type $\psi_k^0 = A(k)\, e^{i2\pi kx}$, subject to the normalizing condition
$J = 1$, serves to illustrate these relations.
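
The prescription of integrating over $k$ first and letting $\xi$ grow afterwards can be tried out numerically. The sketch below uses the closed form $\delta_\xi(k) = \sin(2\pi k\xi)/(\pi k)$ of (68 a) and an assumed test function $A(k) = e^{-k^2}$ (chosen here only for illustration); `delta_xi` and `smeared` are hypothetical names. For this particular $A(k)$ the integral is known in closed form to equal $\operatorname{erf}(\pi\xi)$, which is used only as a check.

```python
import math

def delta_xi(k, xi):
    """Truncated delta function (68a): integral of exp(i*2*pi*k*x) for |x| < xi."""
    if k == 0.0:
        return 2.0 * xi
    return math.sin(2.0 * math.pi * k * xi) / (math.pi * k)

def smeared(xi, k_max=8.0, steps=40000):
    """Integral of delta_xi(k) * A(k) dk with the assumed A(k) = exp(-k**2)."""
    h = 2.0 * k_max / steps
    total = 0.0
    for i in range(steps):
        k = -k_max + (i + 0.5) * h
        total += delta_xi(k, xi) * math.exp(-k * k)
    return total * h

# As xi grows the result tends to A(0) = 1, as (68b) would give directly.
for xi in (0.1, 0.5, 3.0):
    print(xi, smeared(xi), math.erf(math.pi * xi))
```

Although $\delta_\xi(k)$ itself oscillates ever more rapidly and never settles to zero for $k \neq 0$, the integrated result is perfectly definite.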

We may thus say that the functions $\psi_C^0$ belonging to a continuous
spectrum, though not orthogonal to one another in the strict sense of
the term, can be treated as if they were orthogonal to one another and
can be normalized according to the conditions (67a) and (67b) with
$\delta(C) = 0$ for $C \neq 0$.

The usual normalization $\int \psi_n^0\psi_n^{0*}\, dV = 1$ for a function belonging to
a discrete spectrum is equivalent to putting the total probability of
finding the particle under consideration somewhere in the whole of
space equal to 1. The normalization (67) or (67a) can be interpreted
as expressing the fact that the relative probability of finding the
particle within a finite region of space containing the field of force
in which it is moving is infinitely small compared with the
probability of finding it at infinity (where it moves practically as a free
particle). Under these circumstances it is more convenient to normalize
the total probability to infinity rather than to unity. This normalizing
to infinity, corresponding to the relation (67) or (67a), is equivalent
to the usual type of normalization for the quasi-discrete functions
$\frac{1}{\sqrt{\Delta C}} \int_{\Delta C} \psi_C\, dC$, each of which represents a kind of ‘frozen’ wave
packet.


III

MATRICES 

11. Matrix Representation of Physical Quantities and Matrix 
Form of the Equations of Motion 

If a particle is moving in a constant field of force, defined by a potential
energy $U(x, y, z)$ which does not depend upon the time, its total energy
$W$ remains constant. A ‘conservative motion’ of this kind is described,
in wave mechanics, by a particular solution of the equation $(H + p_t)\psi = 0$
of the type $\psi = \psi^0(x, y, z)\, e^{-i2\pi Wt/h}$, where the amplitude function $\psi^0$ and
the associated energy constant satisfy the equation $H\psi^0 = W\psi^0$. If the
particular solutions of the equation $(H + p_t)\psi = 0$, where the
Hamiltonian $H$ does not contain the time explicitly, form a discrete set
corresponding to a discrete spectrum of $H$, then the general solution
can be represented as a sum of these particular solutions with arbitrary
constant coefficients. Thus we may write

$$\psi = \sum_n a_n \psi_n = \sum_n a_n \psi_n^0\, e^{-i2\pi W_n t/h}, \tag{69}$$

the functions $\psi_n^0$ being supposed to be so normalized that they satisfy
the condition $\int |\psi_n^0|^2\, dV = 1$.

If the functions $\psi^0$ form a continuous set, the summation must be
replaced by an integration giving

$$\psi = \int a_C \psi_C\, dC = \int a_C \psi_C^0\, e^{-i2\pi W_C t/h}\, dC, \tag{69 a}$$

where $C$ represents the continuously variable parameters. If some of
the three parameters are quantized while the others are continuously
variable, the summation must be replaced by a combined summation
and integration. Thus, for example, we may have

$$\psi = \sum_{n_2} \sum_{n_3} \int a_{C n_2 n_3} \psi_{C n_2 n_3}\, dC, \tag{69 b}$$

the functions $\psi_C^0$ or $\psi^0_{C n_2 n_3}$ being so normalized that they satisfy the
condition (67), and $a(C) = a_C$ being arbitrary functions of the
continuously variable parameters $C$.

If, as is generally the case, the energy spectrum consists of a
discrete part $W_n$ and a continuous part $W_C$, the general solution of the
equation $(H + p_t)\psi = 0$ is represented by a sum of (69) and (69a) or
(69b), so that

$$\psi = \sum_n a_n \psi_n + \int a_C \psi_C\, dC, \tag{69 c}$$

or

$$\psi = \sum_{n_1} \sum_{n_2} \sum_{n_3} a_{n_1 n_2 n_3} \psi_{n_1 n_2 n_3} + \sum_{n_2} \sum_{n_3} \int a_{C n_2 n_3} \psi_{C n_2 n_3}\, dC. \tag{69 d}$$

We shall first examine the simplest case, i.e. the representation (69)
corresponding to a discrete spectrum. As already explained in Part I,
§ 17, the summation, from the point of view of the probability theory,
expresses the alternative character of the motions represented by the
different functions $\psi_n$ or $\psi_n^0$. The resulting function $\psi$ can be normalized
to unity in the same way as the separate functions $\psi_n$, i.e. it can be
made to satisfy the condition

$$\int \psi\psi^*\, dV = 1. \tag{70}$$

According to (69), in conjunction with the orthogonality and normalizing
relations $\int \psi_m^*\psi_n\, dV = \delta_{mn}$, it then follows that

$$\sum_n a_n a_n^* = 1. \tag{70 a}$$

The quantities $a_n a_n^* = |a_n|^2$ can be interpreted, subject to this
condition, as the probabilities of finding the particle in a state of motion
specified by the function $\psi_n$, irrespective of its position in space.

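The probable (or average) value of any quantity represented by an
operator $F$ is determined by the general formula

$$\bar F = \int \psi^* F \psi\, dV.$$

Putting $\psi = \sum_n a_n \psi_n$, we get

$$\bar F = \sum_m \sum_n a_m^* a_n F_{mn}, \tag{71}$$

where

$$F_{mn} = \int \psi_m^* F \psi_n\, dV. \tag{71 a}$$

The $F_{mn}$ are the ‘matrix elements’ of the quantity $F$ with respect to
the states of motion $\psi_m$ and $\psi_n$. Putting

$$\psi_n = \psi_n^0(x, y, z)\, e^{-i2\pi W_n t/h} = \psi_n^0\, e^{-i2\pi\nu_n t} \qquad (\nu_n = W_n/h),$$

we get

$$F_{mn} = F_{mn}^0\, e^{i2\pi\nu_{mn} t}, \tag{71 b}$$

with

$$F_{mn}^0 = \int \psi_m^{0*} F \psi_n^0\, dV \tag{71 c}$$

and

$$\nu_{mn} = \nu_m - \nu_n = \frac{W_m - W_n}{h}$$

(cf. Part I, §§ 17 and 18).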

So long as the operator $F$ represents a real quantity, the matrix
elements $F_{mn}$, as well as their amplitudes, are Hermitian, i.e. they
satisfy the relations

$$F_{mn} = F_{nm}^*, \qquad F_{mn}^0 = F_{nm}^{0*}. \tag{72}$$


These relations are directly evident if $F$ is a (real) function of the
coordinates alone. To establish them for the general case, let us first
put $F = p_x = \dfrac{h}{2\pi i} \dfrac{\partial}{\partial x}$. We then have

$$F_{nm} = \int \psi_n^* p_x \psi_m\, dV = \frac{h}{2\pi i} \int \psi_n^* \frac{\partial \psi_m}{\partial x}\, dV,$$

and consequently

$$F_{nm}^* = -\frac{h}{2\pi i} \int \psi_n \frac{\partial \psi_m^*}{\partial x}\, dV.$$

Now

$$\int \psi_n \frac{\partial \psi_m^*}{\partial x}\, dV = \int \frac{\partial}{\partial x}(\psi_n \psi_m^*)\, dV - \int \psi_m^* \frac{\partial \psi_n}{\partial x}\, dV,$$

and since the first integral on the right vanishes, it follows that

$$F_{nm}^* = \frac{h}{2\pi i} \int \psi_m^* \frac{\partial \psi_n}{\partial x}\, dV = F_{mn},$$

and so we get (72). The proof can easily be extended to any function
$F$ of the operators $p_x, p_y, p_z$ (and of the coordinates) not involving
complex quantities (with the exception of the $i$ in the expressions for
$p$, which is necessary to make these operators correspond to real
quantities).
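
The Hermitian property of the momentum matrix can be illustrated on a concrete set of functions. The sketch below assumes the characteristic functions of a particle between rigid walls at $x = 0$ and $x = 1$ (an illustrative choice, not an example from the text) and units in which $h/2\pi = 1$; `psi`, `dpsi` and `p_element` are hypothetical helpers. The functions vanish at the walls, so the integrated term in the proof above drops out.

```python
import math

def psi(n, x):
    """Assumed box eigenfunction on 0 < x < 1 (real, normalized)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)

def dpsi(n, x):
    """Derivative of psi(n, x) with respect to x."""
    return math.sqrt(2.0) * n * math.pi * math.cos(n * math.pi * x)

def p_element(m, n, steps=4000):
    """Matrix element p_mn of p_x = (h / 2 pi i) d/dx, in units h / (2 pi) = 1."""
    step = 1.0 / steps
    integral = sum(psi(m, (i + 0.5) * step) * dpsi(n, (i + 0.5) * step)
                   for i in range(steps)) * step
    return -1j * integral

# p_12 and the conjugate of p_21 should agree, as required by (72).
print(p_element(1, 2), p_element(2, 1).conjugate())
```

The elements are purely imaginary and antisymmetric in the indices, which is exactly how a Hermitian matrix built from real functions and the factor $i$ must behave.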

The relations (72) should not be confused with the self-adjointness
relation (51) which, in the case of the integral (71a), runs

$$\int \psi_m^* F \psi_n\, dV = \int \psi_n F \psi_m^*\, dV. \tag{72 a}$$

It is equivalent to (72) only when

$$F = F^*, \tag{72 b}$$

i.e. when $F$ is a function of the coordinates alone, not involving the
operators $p_x, p_y, p_z$ or involving them in even powers only. In the latter
case, which is met with, for example, when $F$ is the energy operator

$$H = (p_x^2 + p_y^2 + p_z^2)/(2m) + U(x, y, z),$$

the Hermitian relations (72) actually reduce to the relation (72a)
expressing the self-adjoint character of $F$. Putting $F = H$, we have,
since $H\psi_n = W_n\psi_n$,

$$H_{mn} = W_n \int \psi_m^* \psi_n\, dV.$$

Taking into account the orthogonality and normalizing relations for
the functions $\psi_n$, this reduces to

$$H_{mn} = W_n \delta_{mn}. \tag{73}$$

We thus get by (71)

$$\bar H = \sum_n a_n a_n^* W_n = \sum_n |a_n|^2 W_n. \tag{73 a}$$


This equation shows that if $\bar H$ is to be interpreted as the probable
value of the energy, then the number $|a_n|^2$ must actually be considered
as the probability of finding the particle in the state of motion
represented by the function $\psi_n$ and associated with the exactly known value
of the energy $W_n$.
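
The relations (70 a) and (73 a) amount to a simple weighted average, which the following minimal sketch spells out. The coefficients and energy values are arbitrary assumed numbers, and `normalize`, `probabilities` and `mean_energy` are hypothetical helper names.

```python
def normalize(coeffs):
    """Rescale the coefficients a_n so that the sum of |a_n|^2 equals 1, as in (70a)."""
    norm = sum(abs(a) ** 2 for a in coeffs) ** 0.5
    return [a / norm for a in coeffs]

def probabilities(coeffs):
    """Probabilities |a_n|^2 of finding the particle in the states psi_n."""
    return [abs(a) ** 2 for a in coeffs]

def mean_energy(coeffs, levels):
    """Probable value of the energy: the sum of |a_n|^2 * W_n, as in (73a)."""
    return sum(p * w for p, w in zip(probabilities(coeffs), levels))

a = normalize([1, 1j, 1])        # assumed coefficients; their phases do not matter here
levels = [1.0, 2.0, 6.0]         # assumed energy values W_n, arbitrary units
print(probabilities(a), mean_energy(a, levels))
```

The phases of the coefficients drop out of the mean energy because $H_{mn}$ is diagonal; for a quantity that is not a constant of the motion they would enter through the non-diagonal terms of (71).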

Similar results hold for any operator $F$ which represents a constant
of the motion, i.e. which commutes with the energy operator. If
there is no degeneracy, i.e. if the values of the energy $W$ corresponding
to different functions $\psi_n$ are all different, then, as already shown in
§ 7, it follows from the relation $HF = FH$ that $F\psi_n = F_n\psi_n$, where
$F_n$ is a constant, namely, the value of the quantity represented by $F$
for the state in question. We thus get, in the same way as before,

$$F_{mn} = \delta_{mn} F_n$$

and

$$\bar F = \sum_n |a_n|^2 F_n.$$

These relations can still be retained when there is degeneracy provided
the functions $\psi_1, \psi_2, \ldots, \psi_r$ forming a degenerate set, i.e. belonging to
the same value of the energy, are so defined that they satisfy the
relations $F\psi_k = F_k\psi_k$ (this can always be done, as already shown in § 7).
If they do not satisfy these relations, we have

$$F\psi_k = \sum_{l=1}^{r} C_{lk} \psi_l$$

[cf. eq. (47b), § 7]. Multiplying this equation by $\psi_m^*$, where $\psi_m$ is some
function of the same degenerate set, and integrating, we get

$$\int \psi_m^* F \psi_k\, dV = \sum_l C_{lk} \int \psi_m^* \psi_l\, dV = C_{mk},$$

since we can always suppose the functions $\psi_n$ to be orthogonal to one
another, irrespective of the degeneracy. We thus get $C_{mk} = F_{mk}$, or

$$F\psi_k = \sum_{l=1}^{r} F_{lk} \psi_l. \tag{74}$$

If $\psi_m$ is some function not belonging to the degenerate set $\psi_1, \psi_2, \ldots, \psi_r$,
it follows that

$$F_{mk} = \int \psi_m^* F \psi_k\, dV = \sum_l F_{lk} \int \psi_m^* \psi_l\, dV = 0.$$

The general expression (71) thus reduces to the sum of the expressions

$$\sum_{k=1}^{r} \sum_{l=1}^{r} a_k^* a_l F_{kl} = \sum_{k=1}^{r} \sum_{l=1}^{r} a_k^* a_l F_{kl}^0 \tag{74 a}$$
taken for different values of the energy $W$. The relation $F_{kl} = F_{kl}^0$
follows from $W_k = W_l$. Thus, irrespective of the degeneracy, the
probable value of the operator $F$ representing a constant of the motion
is independent of the time. This independence of $\bar F$ of the time is
therefore the general criterion of the fact that $F$ is a constant of the motion
and commutes with $H$. If there is no degeneracy, it means that all the
matrix elements of $F$ must vanish with the exception of the ‘diagonal’
elements (i.e. those with two identical indices). In the presence of
degeneracy this restriction is too narrow, the constancy of $\bar F$ being
consistent with non-vanishing values of the matrix elements of $F$ for
all those states for which the energy difference vanishes.
The relation (74) is a particular case of the general equation

$$F\psi_k = \sum_l F_{lk} \psi_l, \tag{75}$$

where the summation is extended over all the characteristic functions
of $H$, irrespective of whether they belong to the same energy or not.
This relation (75) holds for any operator $F$, and reduces to (74) when
$F$ is a constant of the motion. Equation (75) is derived in the same
way as (74) by assuming that the function $F\psi_k$ can be expanded in
a series of the type $\sum_l C_{lk}\psi_l$ with coefficients $C_{lk}$ which may be functions
of the time but do not depend upon the coordinates.† This is equivalent
to assuming that $F\psi_k^0$ can be expanded in a series of the type $\sum_l C_{lk}^0 \psi_l^0$
with constant coefficients $C_{lk}^0$. In the latter case we obtain, by
multiplication by $\psi_m^{0*}$ and integration over the coordinates,

$$\int \psi_m^{0*} F \psi_k^0\, dV = \sum_l C_{lk}^0 \int \psi_m^{0*} \psi_l^0\, dV = C_{mk}^0,$$

i.e. $C_{mk}^0 = F_{mk}^0$ and

$$F\psi_k^0 = \sum_l F_{lk}^0 \psi_l^0. \tag{75 a}$$

From this equation it is possible to derive (75) (provided $F$ does not
contain the operator $p_t$) with the help of the relations $\psi_k^0 = \psi_k\, e^{+i2\pi\nu_k t}$
and $F_{lk}^0 = F_{lk}\, e^{-i2\pi\nu_{lk} t}$, where $\nu_{lk} = \nu_l - \nu_k$.

† This assumption can be justified for a very wide class of operators satisfying certain
conditions which we shall not consider here and which are always fulfilled in practice.

If $F$ is not a constant of the motion, the expression (71) for its
probable value contains terms which represent harmonic oscillations
with the ‘transition’ frequencies $\nu_{mn} = (W_m - W_n)/h$. (The meaning of
this fact for the emission of light has been discussed in Part I, § 17.)
Taking the derivative of $\bar F$ with respect to the time, we get, according
to (71 b),

$$\frac{d\bar F}{dt} = \sum_m \sum_n a_m^* a_n\, i2\pi\nu_{mn} F_{mn},$$

or

$$\frac{d\bar F}{dt} = \frac{2\pi}{h} \sum_m \sum_n a_m^* a_n\, i(W_m - W_n) F_{mn}. \tag{75 b}$$


It can easily be shown that the right side of this expression is equal
to the probable value of $[H, F]$, i.e. to $2\pi i(HF - FH)/h$. We have in fact

$$FH\psi_n = FW_n\psi_n = W_n F\psi_n$$

and, according to (75),

$$HF\psi_n = H \sum_k F_{kn} \psi_k = \sum_k F_{kn} W_k \psi_k,$$

so that

$$(HF - FH)_{mn} = \int \psi_m^* (HF - FH) \psi_n\, dV = \sum_k F_{kn} W_k \int \psi_m^* \psi_k\, dV - W_n \int \psi_m^* F \psi_n\, dV = (W_m - W_n) F_{mn}.$$

We may thus define the operator $dF/dt$ by the matrix equation

$$\left( \frac{dF}{dt} \right)_{mn} = \frac{2\pi i}{h} (W_m - W_n) F_{mn} = \frac{d}{dt}(F_{mn}). \tag{75 c}$$
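
The identity $(HF - FH)_{mn} = (W_m - W_n)F_{mn}$ is easy to confirm on small matrices. The following sketch uses an assumed set of three energy values and an arbitrary Hermitian matrix for $F$; `matmul` and `commutator` are hypothetical helpers written for this illustration.

```python
def matmul(a, b):
    """Matrix product by the row-by-column rule."""
    size = len(a)
    return [[sum(a[m][k] * b[k][n] for k in range(size)) for n in range(size)]
            for m in range(size)]

def commutator(a, b):
    """The matrix of ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    size = len(a)
    return [[ab[m][n] - ba[m][n] for n in range(size)] for m in range(size)]

levels = [1.0, 2.5, 4.0]                       # assumed energy values W_n
H = [[levels[m] if m == n else 0.0 for n in range(3)] for m in range(3)]   # (73)
F = [[0.0, 1 + 2j, 0.5],                       # arbitrary Hermitian matrix
     [1 - 2j, 3.0, -1j],
     [0.5, 1j, -2.0]]

C = commutator(H, F)
expected = [[(levels[m] - levels[n]) * F[m][n] for n in range(3)] for m in range(3)]
print(C == expected or "equal up to rounding")
```

The diagonal elements of the commutator vanish, in agreement with the statement that the diagonal elements $F_{nn}$ carry no time dependence.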


If, in the preceding argument, we replace $H$ by some other operator
$G$, we get, by a twofold application of (75),

$$(FG)\psi_n = F \sum_k G_{kn} \psi_k = \sum_k G_{kn} F\psi_k = \sum_k G_{kn} \sum_m F_{mk} \psi_m = \sum_m \Big( \sum_k F_{mk} G_{kn} \Big) \psi_m.$$

On the other hand, according to the same formula (75), we have

$$(FG)\psi_n = \sum_m (FG)_{mn} \psi_m,$$

where $(FG)_{mn}$ are the matrix elements of the compound operator $FG$.
Therefore it follows that

$$(FG)_{mn} = \sum_k F_{mk} G_{kn}. \tag{76}$$

If we put $F_{mk} = F_{mk}^0\, e^{i2\pi\nu_{mk} t}$, $G_{kn} = G_{kn}^0\, e^{i2\pi\nu_{kn} t}$,
and take into account the relation

$$\nu_{mk} + \nu_{kn} = \frac{W_m - W_k}{h} + \frac{W_k - W_n}{h} = \frac{W_m - W_n}{h} = \nu_{mn}, \tag{76 a}$$

we get $(FG)_{mn} = (FG)_{mn}^0\, e^{i2\pi\nu_{mn} t}$, with

$$(FG)_{mn}^0 = \sum_k F_{mk}^0 G_{kn}^0. \tag{76 b}$$

This relation can be obtained directly by applying the operator $FG$ to
$\psi_n^0$ instead of $\psi_n$ and using (75a) instead of (75).
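
The way the time factors combine in the product can be checked directly. The sketch below attaches the factors $e^{i2\pi\nu_{mn}t}$ to two assumed amplitude matrices, multiplies them by the rule (76), and compares the result with the product of the amplitudes (76 b) carrying the single factor $e^{i2\pi\nu_{mn}t}$. The frequencies, matrices and helper names (`matmul`, `with_time`) are illustrative assumptions.

```python
import cmath

def matmul(a, b):
    """Matrix product (76): rows of the first matrix with columns of the second."""
    size = len(a)
    return [[sum(a[m][k] * b[k][n] for k in range(size)) for n in range(size)]
            for m in range(size)]

def with_time(amplitudes, nu, t):
    """Multiply each amplitude by exp(i 2 pi nu_mn t), where nu_mn = nu[m] - nu[n]."""
    size = len(amplitudes)
    return [[amplitudes[m][n] * cmath.exp(2j * cmath.pi * (nu[m] - nu[n]) * t)
             for n in range(size)] for m in range(size)]

nu = [0.3, 1.1, 2.0]                            # assumed frequencies nu_n = W_n / h
F0 = [[1.0, 2j, 0.0], [-2j, 0.5, 1.0], [0.0, 1.0, -1.0]]
G0 = [[0.0, 1.0, 1j], [1.0, 2.0, 0.0], [-1j, 0.0, 3.0]]
t = 0.37                                        # arbitrary instant

left = matmul(with_time(F0, nu, t), with_time(G0, nu, t))
right = with_time(matmul(F0, G0), nu, t)
print(max(abs(left[m][n] - right[m][n]) for m in range(3) for n in range(3)))
```

The agreement rests entirely on the additivity (76 a) of the transition frequencies, so the intermediate index $k$ leaves no trace in the time factor.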
It should be noticed that equations (76) or (76b) coincide with
equations of § 18, Part I, which were derived by combining the
multiplication and addition laws for the ‘probability amplitudes’ for
transitions from a certain state $m$ to another state $n$ through some
intermediate state $k$. The matrix elements $F_{mk}$ and $G_{kn}$ were
interpreted there as the ‘probability amplitudes’ for the simple transitions
$m \to k$ and $k \to n$ under the influence of perturbing forces characterized
by $F$ and $G$ respectively, and the matrix element $(FG)_{mn}$ as the
probability amplitude of a transition which is a combination of the
preceding two with the intermediate state $k$ remaining unspecified.

We shall return to this interpretation in a later section.

Equations (76) or (76 b) express, from a purely formal point of view,
the multiplication law of matrices. This matrix multiplication law (i.e.
combination of the rows of the first matrix with the columns of the second)
is quite similar to the multiplication law of determinants, which can be
associated with the corresponding matrices. Hence the matrix of the
operator $FG$ is called the product of the matrices of $F$ and $G$.

Matrix multiplication is, in general, non-commutative, just like multi- 
plication (i.e. successive application) of the corresponding operators. 
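The multiplication law (76), and the way the time factors combine in (76b), can be checked numerically. The following is only an illustrative sketch: the four energy levels, the choice h = 1, and the random matrices standing in for the elements of F and G are assumptions made for the example, not quantities taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
W = np.array([0.3, 1.1, 2.0, 3.7])   # assumed energy levels W_n (arbitrary units)
h = 1.0                               # Planck's constant set to 1 for the illustration

def random_matrix(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

F0 = random_matrix(N)                 # stand-in for the elements F^0_mn
G0 = random_matrix(N)                 # stand-in for the elements G^0_mn

def components(A0, t):
    """A_mn(t) = A^0_mn exp(i 2 pi nu_mn t) with nu_mn = (W_m - W_n) / h."""
    nu = (W[:, None] - W[None, :]) / h
    return A0 * np.exp(2j * np.pi * nu * t)

t = 0.37
lhs = components(F0, t) @ components(G0, t)   # sum over k of F_mk(t) G_kn(t), eq. (76)
rhs = components(F0 @ G0, t)                  # (FG)^0_mn exp(i 2 pi nu_mn t), eq. (76b)

product_law_holds = np.allclose(lhs, rhs)
commutes = np.allclose(F0 @ G0, G0 @ F0)      # generic matrices do not commute
```

The intermediate index k drops out of the exponent because the frequencies add as in (76a), so the product carries the single transition frequency belonging to the pair m, n.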

It must be mentioned further that the products of two Hermitian
matrices FG and GF are in general not Hermitian, the conjugate complex
of $(FG)_{mn}$ being equal to $(GF)_{nm}$. The two products are therefore
Hermitian matrices only if they are identical, i.e. if F and G commute
with each other.

If, instead of the product of two operators, we consider their sum 
F+G, which is obviously commutative in the sense that 

$$(F+G)\psi = (G+F)\psi,$$
and form the matrix of this sum, we obtain the relation
$$(F+G)_{mn} = F_{mn}+G_{mn} = (G+F)_{mn}, \tag{76c}$$
which expresses the addition law of matrices, this matrix addition
satisfying the commutative law.

It can easily be shown that, for three or more factors, the associative 
law is satisfied both for operators and for the corresponding matrices, 
just as for ordinary numbers, so that, for example, 

(EF)G = E(FG), 
and therefore 
$$[(EF)G]_{mn} = \sum_k (EF)_{mk}G_{kn} = \sum_k\sum_l E_{ml}F_{lk}G_{kn} = \sum_l E_{ml}(FG)_{ln} = [E(FG)]_{mn}.$$


We thus see that there exists a one-to-one correspondence between different
operators and the associated matrices, both with respect to addition and
multiplication. This correspondence enables us to replace the operator
representation of physical quantities, which we introduced in the
preceding chapter, by a matrix representation, each physical quantity,
whether numerically expressible, i.e. having a definite value, or not,
being represented by an array of matrix elements

$$\begin{pmatrix} F_{11} & F_{12} & F_{13} & \cdots\\ F_{21} & F_{22} & F_{23} & \cdots\\ F_{31} & F_{32} & F_{33} & \cdots\\ \vdots & \vdots & \vdots & \end{pmatrix} \tag{77}$$
or
$$\begin{pmatrix} F^0_{11} & F^0_{12} & F^0_{13} & \cdots\\ F^0_{21} & F^0_{22} & F^0_{23} & \cdots\\ F^0_{31} & F^0_{32} & F^0_{33} & \cdots\\ \vdots & \vdots & \vdots & \end{pmatrix} \tag{77a}$$


These will be denoted in future by single letters $F$ and $F^0$ respectively,
and will be used in exactly the same way as the operator representing
the physical quantity in question, without direct reference to
characteristic functions of any kind.

It should, however, be kept in mind that such functions are indirectly
implied in the very definition of the matrices $F$ or $F^0$, being the
characteristic functions of the energy operator H. Referred to these particular
functions, the energy is represented by a diagonal matrix


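$$H = \begin{pmatrix} W_1 & 0 & 0 & \cdots\\ 0 & W_2 & 0 & \cdots\\ 0 & 0 & W_3 & \cdots\\ \vdots & \vdots & \vdots & \end{pmatrix}, \tag{77b}$$
i.e. $H_{mn} = \delta_{mn}W_n$,

where
$$\delta = \begin{pmatrix} 1 & 0 & 0 & \cdots\\ 0 & 1 & 0 & \cdots\\ 0 & 0 & 1 & \cdots\\ \vdots & \vdots & \vdots & \end{pmatrix}$$
is the so-called ‘unit-matrix’, which in future will sometimes be denoted
by 1 ($\delta_{mn} = 1_{mn}$).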

The matrix elements of (77b), i.e. the energy-levels $W_n$, appear in
the relations
$$F_{mn} = F^0_{mn}\,e^{i2\pi(W_m-W_n)t/h} \tag{77c}$$
between the elements of (77) and (77a), the latter being simple
numbers. The absolute values of the energy cannot, however, be derived 
from these relations, which contain their differences only. 

To distinguish the quantities $F_{mn}$ and $F^0_{mn}$ we shall call the $F_{mn}$ the
matrix components and the $F^0_{mn}$ the matrix elements of the quantity F.
For the energy as well as for any other constant of the motion, the
matrix components coincide with the corresponding elements, so that
we can then put $F = F^0$.


The representation of physical quantities by means of operators
(including functions of the coordinates alone) differs from the repre- 
sentation by means of matrices in that the representation by operators 
is absolute, while the representation by matrices is relative. By relative 
we mean that the matrix elements of a quantity are defined with 
respect to a particular set of stationary states which are specified by 
the characteristic functions of a particular operator—or a system of 
commutable operators (like $H$, $M_z$, and $M^2$). We shall see later that
this distinction is not so fundamental as it seems. The operator repre- 
sentation given above is based upon the use of the coordinates (and 
the time) as the directly observable quantities. But this is not neces- 
sary. Certain other quantities—e.g. the momentum components—can 
assume the role of directly observable quantities. The coordinates then 
become represented as operators in terms of these new quantities. 
Leaving this aside, and retaining the variables x, y, z, t as the primary
and directly observed quantities, we can maintain the above distinction 
as a fundamental one. 

Now it can easily be shown that the determination of the matrix 
elements of any operator F with respect to the characteristic functions
of some other operator H (or of a system of three commutable operators) 
does not necessarily require an actual knowledge of these functions. It 
is in fact sufficient to know that they are such as to make the matrix 
of H diagonal. If, moreover, both H and F are explicitly defined as 
functions of the coordinates x, y, z and of the elementary operators 
$p_x$, $p_y$, $p_z$, then, taking into account the commutation relations

$$p_x x - x p_x = \frac{h}{2\pi i}\cdot 1,\qquad p_y y - y p_y = \frac{h}{2\pi i}\cdot 1,\qquad p_z z - z p_z = \frac{h}{2\pi i}\cdot 1, \tag{78}$$
$$p_x y - y p_x = 0,\ \text{etc.}, \tag{78a}$$
$$xy - yx = 0,\qquad p_x p_y - p_y p_x = 0,\ \text{etc.}, \tag{78b}$$

(in the matrix representation) we can calculate, with the help of the
matrix addition and multiplication laws together with the condition that
$x$, $y$, $z$, $p_x$, $p_y$, $p_z$ shall all be Hermitian matrices, the matrix elements
both of H and of any other non-diagonal matrix F. After the matrix
elements of H and F have been determined, we can then calculate the
matrix components of F (those of H coinciding with the elements).

So far, therefore, as the determination of the matrix elements or 
components of any physical quantity with respect to the stationary 
states defined by some energy operator H is concerned, we can replace 
the solution of Schrödinger’s equation $H\psi^0 = W\psi^0$ and the subsequent
integration $F^0_{mn} = \int \psi^{0*}_m F\psi^0_n\,dV$ by the following problem:


(1) To determine the matrix elements of the quantities $x$, $y$, $z$,
$p_x$, $p_y$, $p_z$ subject to the commutation conditions (78), (78a), (78b), in
such a way that the matrix of the function $H(x, y, z; p_x, p_y, p_z)$ shall be
diagonal, i.e. that $H_{mn} = 0$ unless n = m.

(2) Knowing the matrices $x$, $y$, $z$, $p_x$, $p_y$, $p_z$, to calculate the matrix
elements (or components if the H-matrix is added to the list) of any given
function $F(x, y, z; p_x, p_y, p_z)$.

In this way the functions $\psi^0_n$, specifying the stationary states to
which the matrix elements refer, can be completely eliminated from 
the matrix theory, and the latter built up as a closed and consistent 
theory, in the air, as it were, by the logical attraction of its elements, 
and not requiring the use of any ideas extraneous to it for its support. 

It should be noticed that the two parts of the above problem are, 
in a certain sense, reciprocal to one another—for in the first part 
we are concerned with the solution of a system of matrix equations 
for the unknown matrices $x$, $y$, $z$, $p_x$, $p_y$, $p_z$, and in the second with
the calculation of an explicitly given function of these fundamental 
matrices. 
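For the linear oscillator the two parts of the problem can be illustrated with finite matrices. The sketch below is an illustration under stated assumptions: it takes the well-known oscillator matrices as given rather than deriving them, cuts them off at a hypothetical size N (the true matrices are infinite), and sets h/2π, the mass and the frequency equal to 1. Because of the cut-off, the commutation relation (78) can hold only in the upper-left block, and only the first N − 1 energy values are meaningful.

```python
import numpy as np

N = 8                      # truncation size (assumption: the true matrices are infinite)
hbar = 1.0                 # h / (2 pi), set to 1 for the illustration
m = omega = 1.0            # hypothetical oscillator mass and angular frequency

k = np.arange(1, N)
a = np.diag(np.sqrt(k), k=1)                       # a[j-1, j] = sqrt(j)
x = np.sqrt(hbar / (2 * m * omega)) * (a + a.T)    # Hermitian coordinate matrix
p = 1j * np.sqrt(m * omega * hbar / 2) * (a.T - a) # Hermitian momentum matrix

# Part (1): x and p obey p x - x p = (h / 2 pi i) * 1 and make H diagonal.
commutator = p @ x - x @ p
H = p @ p / (2 * m) + 0.5 * m * omega**2 * (x @ x)

# Part (2): the matrix of any function F(x, p) follows by matrix algebra alone.
x_squared = x @ x

levels = np.diag(H).real   # first N - 1 entries approximate (n + 1/2) * hbar * omega
```

No characteristic function enters the calculation at any point; the only inputs are the matrices themselves and the laws of matrix addition and multiplication.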

In problems with one degree of freedom (corresponding to the motion 
of a particle in one dimension, such as the linear oscillator) the
condition ‘H is a diagonal matrix’, together with the commutation
conditions (78), etc., provides the basis for a complete and physically
unambiguous determination of the fundamental matrices, e.g. $x$ and
$p_x$, and consequently of the matrices representing, ‘from the point of
view of H’ as it were, any other quantity $F(x, p_x)$. It should be noticed,
however, that there remains a certain ambiguity which is irrelevant
for the physical interpretation of the matrix elements, but which, as
we shall see later on, is very important for the correct understanding
of the relation between matrix theory and classical mechanics. If,
in fact, $x^0_{mn}$ and $(p_x)^0_{mn}$ are matrix elements which satisfy the
conditions of the problem (or rather of its first part), then any elements of
the type

$$x^0_{mn}\,e^{i(\alpha_m-\alpha_n)},\qquad (p_x)^0_{mn}\,e^{i(\alpha_m-\alpha_n)},$$

where $\alpha_m, \ldots, \alpha_n$ are arbitrary real numbers, will also satisfy these
conditions, the elements of any other matrix $F^0_{mn}$ being replaced
accordingly by $F^0_{mn}e^{i(\alpha_m-\alpha_n)}$. This result can easily be proved directly, or
deduced from the original definition of the matrix elements in terms of
the characteristic functions $\psi^0_n$ if we use the fact that each of them can
be replaced by its product by $e^{-i\alpha_n}$ without any violation of the
orthogonality and normalizing relations. This amounts to the introduction
of an arbitrary ‘phase’ into $\psi_n$ (putting $\psi_n = \psi^0_n e^{-i[(2\pi/h)W_n t+\alpha_n]}$) or ‘phase
difference’ into $F_{mn}$ (putting $F_{mn} = F^0_{mn}e^{i(2\pi\nu_{mn}t+\alpha_m-\alpha_n)}$).

The ‘phase’ constants α vanish in the diagonal elements $F^0_{nn}$ which,
as we know, determine the average or probable value of the quantity
represented by F in a stationary state with the energy $W_n$. The phase
constants also vanish in the products $F^0_{mn}F^{0*}_{mn}$, i.e. in the squares of
the moduli of the matrix elements referring to different stationary
states ($W_m \neq W_n$). These products determine the probability of a
transition between the two states under the influence of a perturbation 
proportional to F. 
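The irrelevance of the phase constants can be verified on arbitrary Hermitian matrices. In the sketch below the two matrices and the phases α are random stand-ins chosen only for illustration; the check is that the diagonal elements and the squared moduli are unchanged, and that a commutator is transformed in the same way as the matrices themselves, so that a relation of the form (78), whose right-hand side is a multiple of the unit matrix, is left intact.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5

def random_hermitian(n):
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return A + A.conj().T

F0 = random_hermitian(N)                      # stand-in for the elements F^0_mn
G0 = random_hermitian(N)                      # a second quantity G^0_mn
alpha = rng.uniform(0.0, 2.0 * np.pi, size=N) # arbitrary real phase constants

def rephase(M):
    """M_mn -> M_mn * exp(i (alpha_m - alpha_n))."""
    return M * np.exp(1j * (alpha[:, None] - alpha[None, :]))

F1, G1 = rephase(F0), rephase(G0)

diagonal_unchanged = np.allclose(np.diag(F1), np.diag(F0))
moduli_unchanged = np.allclose(np.abs(F1) ** 2, np.abs(F0) ** 2)
still_hermitian = np.allclose(F1, F1.conj().T)
commutator_covariant = np.allclose(F1 @ G1 - G1 @ F1,
                                   rephase(F0 @ G0 - G0 @ F0))
```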

In the general case of motion in three dimensions, the condition that 
the energy matrix should be diagonal (together with the commutation 
relations (78), etc.) is not always sufficient for a physically unambiguous 
determination of the matrices $x$, $y$, $z$, $p_x$, $p_y$, $p_z$, and it has then to be
supplemented by a similar condition for one or two other matrices 
representing quantities which are constants of the motion, for instance, 
the z-component and the square of the angular momentum for motion 
in a central field of force. Such additional conditions are necessary in 
the case of degeneracy, the existence of which is revealed in the matrix 
theory, by the identity of several (diagonal) elements of the energy 
matrix. The matrices representing constants of the motion must of 
course—irrespective of the presence or absence of degeneracy—com- 
mute with the energy matrix, i.e. satisfy the relation 


$$(HF)_{mn} = (FH)_{mn},$$
which corresponds to the operator relation HF = FH. The multiplication
law (76), together with the condition that H is a diagonal matrix
($H_{mn} = W_n\delta_{mn}$), give

$$(HF)_{mn} = \sum_k H_{mk}F_{kn} = W_m F_{mn},$$
$$(FH)_{mn} = \sum_k F_{mk}H_{kn} = W_n F_{mn}.$$

The condition that F is a constant of the motion therefore reduces to
$$(W_m - W_n)F_{mn} = 0,$$
which means that
$$F_{mn} = 0 \quad\text{if } W_m \neq W_n,$$


i.e. that the matrix elements of F vanish for all states except those 
which correspond to the same value of the energy. Therefore, if there 
is no degeneracy, the constants of the motion must be represented by 
diagonal matrices. If there is degeneracy they may but need not 
necessarily have a diagonal form. 
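A small numerical sketch may make the role of degeneracy concrete. The energy values below are hypothetical, with one deliberately repeated level, and F is a random Hermitian matrix used only as a stand-in. The commutator HF − FH is compared with $(W_m - W_n)F_{mn}$, and a matrix that commutes with H is then obtained by keeping only the elements joining states of equal energy.

```python
import numpy as np

rng = np.random.default_rng(2)
W = np.array([1.0, 2.0, 2.0, 5.0])     # hypothetical levels; the value 2.0 is degenerate
H = np.diag(W)

A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
F = A + A.conj().T                     # arbitrary Hermitian matrix

commutator = H @ F - F @ H
energy_gap = W[:, None] - W[None, :]   # W_m - W_n

# Keep only elements between states of the same energy: a constant of the motion.
same_energy = np.isclose(energy_gap, 0.0)
F_const = np.where(same_energy, F, 0.0)

commutes_with_H = np.allclose(H @ F_const - F_const @ H, 0.0)
off_diagonal_survives = abs(F_const[1, 2]) > 0.0   # allowed inside the degenerate block
```

The surviving off-diagonal element inside the degenerate block shows why, in that case, a constant of the motion need not be diagonal.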

The preceding result has already been obtained in a somewhat
different manner [cf. (77d)]. It should be remarked that a function
$f(F)$ of a diagonal matrix is itself a diagonal matrix, the elements of
which are equal to the same function of the corresponding elements
of the argument matrix

$$[f(F)]_{nn} = f(F_{nn}).$$
This follows from the fact that the characteristic values of an operator
$f(F)$ must be equal to the same function of the characteristic values
of F. This result has already been stated when discussing the energy
operator (§ 7). It can be obtained directly from the matrix multiplication
law which gives, when F is a diagonal matrix,

$$(F^2)_{mn} = \sum_k F_{mk}F_{kn} = F_{mm}^2\,\delta_{mn},$$
$$(F^3)_{mn} = \sum_k (F^2)_{mk}F_{kn} = F_{mm}^3\,\delta_{mn},\ \text{etc.},$$

so that, if $f(F)$ can be expanded in the form $\sum_k a_k F^k$ where $a_k$ are
numerical coefficients, we have
$$\Big(\sum_k a_k F^k\Big)_{mn} = \Big(\sum_k a_k F_{mm}^k\Big)\delta_{mn}.$$
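As a brief check of this statement, the sketch below evaluates a polynomial of a diagonal matrix by matrix multiplication and compares it with the same polynomial applied to each diagonal element. The diagonal values and the coefficients are hypothetical numbers chosen for the example.

```python
import numpy as np

F = np.diag([0.5, 1.0, 2.0])      # hypothetical diagonal matrix
coeffs = [1.0, -2.0, 0.0, 3.0]    # hypothetical a_0, a_1, a_2, a_3

def poly_of_matrix(M):
    """sum_k a_k M^k, formed with the matrix multiplication law."""
    return sum(a_k * np.linalg.matrix_power(M, k) for k, a_k in enumerate(coeffs))

def poly_of_number(v):
    """The same function applied to an ordinary number."""
    return sum(a_k * v**k for k, a_k in enumerate(coeffs))

f_matrix = poly_of_matrix(F)
f_elementwise = np.diag([poly_of_number(v) for v in np.diag(F)])
agree = np.allclose(f_matrix, f_elementwise)
```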

As has been pointed out at the beginning of this section, matrices 
representing real physical quantities must satisfy the Hermitian condition.
The products of two such matrices F and G (unless they commute
with each other) FG and GF cannot therefore represent a real
physical quantity. Representation of real physical quantities can be
obtained, however, by taking the sum of the two products, or their
difference multiplied by i. In the first case we get, on dividing by 2,
the ‘symmetrized’ representation ½(FG+GF) of the classical product
of the corresponding quantities. In the second case we get, with the
additional factor 2π/h, the bracket expression [F, G] which has been
already considered in § 8 and which corresponds to the Poisson-bracket
expression of the classical theory.
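The Hermitian character of the two combinations can be confirmed numerically. The matrices below are random Hermitian stand-ins, an assumption made purely for illustration; the real factor 2π/h is omitted since it does not affect the Hermitian property.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_hermitian(n):
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return A + A.conj().T

def is_hermitian(M):
    return np.allclose(M, M.conj().T)

F = random_hermitian(4)
G = random_hermitian(4)

FG = F @ G
GF = G @ F
symmetrized = 0.5 * (FG + GF)     # stands for the classical product
bracket = 1j * (FG - GF)          # difference multiplied by i

product_is_hermitian = is_hermitian(FG)
conjugate_transpose_is_GF = np.allclose(FG.conj().T, GF)
```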


12. The Correspondence between Matrix and Classical Mechanics

The matrix representation of physical quantities was introduced by
W. Heisenberg towards the end of 1925. A few months later Schrödinger’s
wave-mechanical theory appeared, but nevertheless Heisenberg,
Born, and Jordan continued, for some time during 1926, to
develop their ‘matrix theory’, without seeing any connexion between
it and the ‘wave theory’. The connexion was finally discovered by
Schrödinger (and independently by Pauli) who found that the
Heisenberg-Born-Jordan matrix elements could be calculated from the wave
functions by means of the formula $F^0_{mn} = \int \psi^{0*}_m F\psi^0_n\,dV$. This little bit
of history serves to illustrate the fact that the matrix theory does not 
need a wave-mechanical support, but can be made completely ‘self- 
supporting’. We shall see later that the connexion between the wave 
theory and the matrix theory can actually be reversed in the sense that 
the matrix theory, in a generalized form due to Dirac and Jordan, 
contains the wave-mechanical theory as a particular case (§ 14). 

In his formulation of the matrix theory, Heisenberg was guided by 
Bohr’s ideas concerning the correspondence between the quantum and 
the classical description of the phenomena of radiation. In ‘the good 
old days’ before the coming of the quantum theory, atomic phenomena, 
and in particular those connected with the emission or absorption of 
radiation, were described in terms of a steady motion of the electrons. 
To this idea of steady (or continuous) motion, Bohr added the idea of 
transitions from one state of motion to another. In this way, between 
the years 1913 and 1925, physicists gradually became accustomed to 
considering two types of mechanical quantities—classical and quantum- 
mechanical. On the one hand we had, for example, the classical 
frequencies or amplitudes referring to the steady motion (analysed by 
means of a Fourier series into a sum of harmonic vibrations), while 
on the other hand we had the quantum frequencies or amplitudes 
referring to the transitions. 

By means of his ‘correspondence principle’, Bohr was able, in 1918, 
to establish an approximate relationship between the classical and the
quantum-mechanical quantities. Advancing still further along the path
laid down by Bohr, Heisenberg rejected the classical quantities
altogether, as devoid of physical meaning, and devised the matrix scheme
(improved a little later by Born and Jordan) for the direct calculation 
of the quantum-mechanical quantities. 

The correspondence principle can be explained in the simplest way 
for a one-dimensional motion, restricted classically to a finite region, 
e.g. lying between $x'$ and $x''$, and therefore periodic. The coordinate x
of the particle can then be described classically as a periodic function
of the time and expanded in a Fourier series of the form

$$x(t) = \sum_{k=-\infty}^{+\infty} x^0(k)\,e^{i2\pi k\nu t}, \tag{79}$$

where ν = 1/τ is the fundamental frequency of oscillation (τ is the
period of oscillation, i.e. the duration of the ‘round trip’ from $x'$ to $x''$
and back again to $x'$), and $x^0(k)$ is the amplitude of the kth harmonic
term having a frequency kν. The two complex terms with the frequencies
+kν and −kν must, of course, combine to form a real term
of the type
$$a_k\cos 2\pi|k|\nu t + b_k\sin 2\pi|k|\nu t;$$
it follows that the amplitudes $x^0(+k)$ and $x^0(-k)$ must be conjugate
complex quantities
$$x^0(-k) = x^0(+k)^*, \tag{79a}$$
giving $a_k = x^0(k)+x^0(k)^*$, $b_k = i\{x^0(k)-x^0(k)^*\}$.

Bohr’s theory, in so far as it was concerned with steady motions, 


restricted these motions by quantum conditions which, in the present 
case, reduce to the single equation 


$$J = \oint g\,dx = nh, \tag{80}$$

specifying the quantized values of the energy $W = W_n$ and hence determining
the fundamental frequencies $\nu = \nu_n$. Putting $g = \sqrt{2m(W-U)}$,
and differentiating the integral

$$J = \oint \sqrt{2m(W-U)}\,dx$$

with respect to W (considered as a parameter), we get

$$\frac{dJ}{dW} = \oint\frac{m\,dx}{\sqrt{2m(W-U)}} = \oint\frac{dx}{\sqrt{2(W-U)/m}} = \oint\frac{dx}{v} = \oint dt = \tau,$$
i.e.
$$\nu = \frac{dW}{dJ}. \tag{80a}$$


This relation is a special case of the general relations between the
energy, the fundamental frequencies $\nu_1$, $\nu_2$, $\nu_3$, and the fundamental
moduli of periodicity $J_1$, $J_2$, $J_3$ of the action function S which were
deduced, in an earlier chapter, for motion in three dimensions, with
the help of the theory of canonical transformations (Chap. I, § 5).
Although the ‘classical’ frequency ν given by (80a) refers to a steady
motion, nevertheless it is expressed as the ratio of the differences of
W and J for two different, though closely neighbouring, motions as if
it were associated with a transition between them. In fact the relation
(80a) bears a striking resemblance to Bohr’s frequency condition
$$\nu_{mn} = \frac{W_m - W_n}{h},$$
which gives the quantum frequency associated with a transition between
two more or less widely different ‘quantized’ states m and n. Introducing
the quantized values of the integral J, we can rewrite the
preceding equation in the form

$$\nu_{mn} = (m-n)\,\frac{W_m-W_n}{J_m-J_n} = (m-n)\,\frac{\Delta W}{\Delta J}. \tag{80b}$$

If W varies slowly with J, and if the quantum jump $m-n$ is not too
large compared with m or n, then the difference ratio ΔW/ΔJ can be
replaced approximately by the differential coefficient dW/dJ. From
(80a) we then get the following approximate relation between the
classical and the quantum frequencies:

$$\nu_{mn} \cong (m-n)\,\nu. \tag{80c}$$
We may regard this relation as indicating an approximate coincidence 
or a ‘correspondence’ between the quantum frequency associated with 
a k-fold jump and the classical frequency of the harmonic oscillation 
of the order k (k = m—n). 
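The quality of this approximation can be seen in a simple case that is not treated in the text and is introduced here only as an assumed illustration: a particle moving freely between two rigid walls a distance L apart. The quantum condition ∮ g dx = nh then gives $2L\sqrt{2mW} = nh$, i.e. $W = J^2/(8mL^2)$, so that both frequencies are available in closed form. Units with h = m = L = 1 are an arbitrary choice.

```python
h = 1.0      # Planck's constant (arbitrary units)
mass = 1.0   # hypothetical particle mass
L = 1.0      # hypothetical distance between the walls

def energy(n):
    """W_n from the quantum condition J = n h with W = J^2 / (8 m L^2)."""
    return (n * h) ** 2 / (8.0 * mass * L**2)

def classical_frequency(n):
    """nu = dW/dJ evaluated at J = n h, the classical relation nu = dW/dJ."""
    return n * h / (4.0 * mass * L**2)

def quantum_frequency(m, n):
    """Bohr's frequency condition nu_mn = (W_m - W_n) / h."""
    return (energy(m) - energy(n)) / h

def ratio(n, k):
    """Quantum frequency of a k-fold jump divided by the k-th classical harmonic."""
    return quantum_frequency(n + k, n) / (k * classical_frequency(n))

low = ratio(2, 1)       # small quantum number: poor agreement
high = ratio(1000, 1)   # large quantum number: close agreement
```

For this system the ratio works out to 1 + k/(2n), so the two frequencies agree closely only when the jump k is small compared with n, as the text requires.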

This correspondence between the classical and the quantum fre- 
quencies forms the nucleus of Bohr’s correspondence principle. The 
principle is extended by asserting that, in addition to this correspon- 
dence between the frequencies, there is also a correspondence between 
the amplitudes. 

Let us denote the functions x(t) for the nth stationary state by $x_n(t)$
and the expansion coefficients $x^0(k)$ by $x^0_n(k)$. Formula (79) then
becomes

$$x_n(t) = \sum_{k=-\infty}^{+\infty} x^0_n(k)\,e^{i2\pi k\nu t}. \tag{81}$$

Writing $m-n$ instead of k and putting
$$x^0_n(m-n) = x^0_{mn}, \tag{81a}$$
formula (81) becomes
$$x_n(t) = \sum_{m=-\infty}^{+\infty} x^0_{mn}\,e^{i2\pi(m-n)\nu t}. \tag{81b}$$


Now if the classical frequency $(m-n)\nu$ corresponds to the quantum
frequency $\nu_{mn}$ of the light emitted by the system under consideration
(linear oscillator) as a result of the transition m → n (if $W_m > W_n$),
then the classical amplitude $x^0_{mn}$ associated with this frequency must,
according to Bohr, correspond to the quantum amplitude of the emitted
light, the correspondence being such that the intensity of the emitted
light must coincide approximately with the intensity calculated
classically on the assumption that the motion of the particle (which is
supposed to possess an electric charge without which there would be
no radiation) is represented by the simple harmonic term

$$x_{mn} = x^0_{mn}\,e^{i2\pi(m-n)\nu t}.$$


The approximation with regard to intensity must be the closer the 
closer the approximation with regard to frequency. 

The ability of the correspondence principle to predict intensities has 
been verified in those cases where there is actually a close approxima- 
tion between the classical and quantum frequencies. For example, it 
was able to predict successfully the relative intensities of the neigh- 
bouring lines appearing in the Stark effect. Nevertheless the nature of 
the correspondence established by Bohr remained mysterious, until 
Heisenberg, towards the end of 1925, unveiled it in a way worthy of 
admiration both for its simplicity and for its boldness. Basing his theory 
upon the principle that only those things have a real existence which 
can be observed, Heisenberg put forward the idea that classical quan- 
tities do not exist at all, since they do not produce any directly observed 
optical effects. In fact the position and intensity of the observed 
spectrum lines can only be expressed in terms of quantum or transition 
quantities. 

From this point of view, the classical method of describing the motion 
of the particle by determining its coordinates for a given stationary 
state n as a certain function of the time $x_n(t)$, which could be expanded
in a Fourier series (81b), was to be considered as an approximation
to the description of the motion by means of a double array or matrix
components of the form

$$x_{mn} = x^0_{mn}\,e^{i2\pi\nu_{mn}t}$$
‘corresponding’ to the totality of the classical harmonic terms for 
different values of m and n in the same sense in which an approxima- 
tion corresponds to the truth. 


At this point two different possibilities for reforming classical
mechanics seemed to be open. The one consisted in assuming that the
motion of the particle in a stationary state n can be described as a
definite function of the time, namely, by the series

$$x_n(t) = \sum_{m=-\infty}^{+\infty} x^0_{mn}\,e^{i2\pi\nu_{mn}t},$$

which should replace the simple Fourier series (81b), and that the
equations of motion should be so modified as to lead to solutions of
this new type instead of solutions of the type (81b).

The second possibility was to assume that the classical description 
of motion, establishing a definite dependence of the position of the 
particle upon the time, had to be abandoned and replaced by a quantum 
description in which the coordinate x was to be determined as a matrix, 
made up of components of the type $x^0_{mn}e^{i2\pi\nu_{mn}t}$. In this case the
external form of the classical equations of motion could be maintained
and only their physical meaning altered, the variables $x$, $p_x$, $H$, etc.,
being regarded and determined not as ordinary quantities but as
matrices.

With an unerring intuition Heisenberg chose the second way, thus 
giving up the very idea of motion in the classical sense (as being funda- 
mentally unobservable and therefore devoid of physical meaning) and 
laying the foundation of the new quantum or matrix mechanics. The 
idea that the quantum description of motion amounts to the deter- 
mination of quantities relating only to transitions between different 
states requires an important amendment, for besides such components 
a matrix contains diagonal components or elements relating to definite 
states taken separately. As we know, these diagonal elements are equal 
to the average or probable values of the quantity represented by the 
matrix for the corresponding states. This result, which has already 
been discussed in Chap. I, § 5, follows also from the preceding considera- 
tions connected with the correspondence principle. The time-average 
value of some quantity, e.g. x, as represented by a Fourier series (81), is 
obviously equal to that term of this series which does not depend upon 
the time, for which therefore k = 0. We thus have 


$$\overline{x_n(t)} = x^0_n(0),$$
or, using the notation (81a),
$$\overline{x_n(t)} = x^0_{nn}.$$
Having defined every physical quantity as a matrix, Heisenberg
naturally enough replaced the usual multiplication law for ordinary
numbers by the matrix multiplication law. In this he was guided by
the necessity of securing the form
$$F_{mn} = F^0_{mn}\,e^{i2\pi\nu_{mn}t}$$
with the same transition frequencies $\nu_{mn}$ for the matrix representing
any function F(x) as those which appear in the matrix (82a) for the
coordinate x. Taking, for instance, $F(x) = x^2$ and using the matrix
multiplication law, we get

$$(x^2)_{mn} = \sum_k x_{mk}x_{kn} = \Big(\sum_k x^0_{mk}x^0_{kn}\Big)e^{i2\pi\nu_{mn}t} = (x^2)^0_{mn}\,e^{i2\pi\nu_{mn}t}$$

as a consequence of the relations $\nu_{mk} = (W_m-W_k)/h$, $\nu_{kn} = (W_k-W_n)/h$,
$\nu_{mn} = (W_m-W_n)/h = \nu_{mk}+\nu_{kn}$; cf. (76) and (76b).

Having introduced matrices to represent physical quantities and the 
matrix multiplication law for the calculation of matrices representing 
functions of such quantities, Heisenberg kept unaltered the form of the
equation of the motion
$$\frac{d^2x}{dt^2} = f(x),$$
understanding by x and f(x) not the usual variables but the corresponding
matrices, and put Bohr’s quantum condition $\oint g\,dx = nh$ in
the form
$$(gx - xg)_{nn} = \frac{h}{2\pi i},$$
leaving the question of the non-diagonal elements of the matrix open.
The commutation condition
$$gx - xg = \frac{h}{2\pi i}\cdot 1,$$
which also fixes the non-diagonal elements of this matrix (as equal to
zero) was established by way of a generalization somewhat later by
Born and Jordan, and still later was recognized (by Schrödinger and
Eckart) as giving the key for the transition from matrix mechanics to
wave mechanics, this transition consisting essentially in considering x
as an ordinary variable and g as the operator $\frac{h}{2\pi i}\frac{\partial}{\partial x}$, and further in


replacing matrix equations by operator equations with the wave func- 
tion to be operated upon. 

The information obtained from the wave-mechanical treatment of 
a problem is more complete than that obtained from the matrix- 
mechanical treatment, for in addition to the matrix elements we obtain, 
in the former case, the wave functions which serve to determine the
probable location of the particle, its probable velocity, and so on. In
the matrix mechanics the notion of probability with reference to 
separate states appears only through the diagonal elements, represent- 
ing probable values, while the non-diagonal elements can be interpreted 
under certain conditions as the probability amplitudes for transitions 
between different states. In Heisenberg’s original theory, the matrix 
components of the coordinate were looked for as quantities which 
determine the intensity of radiation or, what amounts to the same 
thing, the probability of transitions with emission of light, it being 
assumed that the intensity of radiation associated with the matrix
component $x_{mn} = x^0_{mn}e^{i2\pi\nu_{mn}t}$ is the same as it would be on the classical
theory if $x_{mn}$ represented the actual motion of the particle as a harmonic
function of the time. The result of this assumption is the same as that
obtained in Part I in connexion with Schrödinger’s theory of radiation,
namely, that the probability of a spontaneous transition m → n with
emission of energy in the form of monochromatic light of the frequency
$\nu_{mn}$ is equal (per unit time) to

$$A_{mn} = \frac{64\pi^4\nu_{mn}^3}{3c^3h}\,e^2\,|x^0_{mn}|^2,$$
where e is the electrical charge of the particle [Part I, eq. (93)]. 

In the preceding sketch of the development of Heisenberg’s matrix 
theory from Bohr’s correspondence principle we did not attempt to give 
a direct proof of the latter so far as it refers to the connexion between 
the Fourier amplitudes and the matrix elements, having confined our- 
selves to the frequencies with respect to which the correspondence 
could be established by means of Bohr’s own theory. This gap can be 
filled with the help of wave mechanics, or rather that approximate form 
of it which has been discussed in Chap. I, § 5, and which corresponds
to the classical mechanics together with Bohr’s quantum conditions. 

We have already used this approximate form of the theory for com- 
paring the classical time-averages (which are equal to the constant term 
in the Fourier expansion of the corresponding quantity F considered 
as a function of the time) with its probable values, defined by the 
integrals $\int \psi_n^* F\psi_n\,dx$, which are nothing else but the diagonal elements
$F_{nn} = F^0_{nn}$ of the matrix representing F. We have found that to the
approximation implied by the formula (23a), § 4,

$$\psi_n = \frac{c_n}{\sqrt{v_n}}\,e^{i2\pi s_n(x,t)/h}, \tag{82}$$

where $v_n$ is the velocity of the particle (defined by the equation
$v_n = \sqrt{2(W-U)/m}$ as a function of its position x) and
$$s_n(x,t) = s^0_n(x) - W_n t$$
the classical action function for the state in question (with the energy
$W_n$), the classical time-average $\frac{1}{\tau}\int_0^\tau F(t)\,dt$ coincides with the probable
value $\int F\psi_n^*\psi_n\,dx$ provided $\psi_n$ is normalized to unity, that is, the
coefficients $c_n$ are set equal to $\sqrt{2/\tau}$:

$$\int_{x'}^{x''}|\psi_n|^2\,dx = |c_n|^2\int_{x'}^{x''}\frac{dx}{v_n} = \tfrac{1}{2}|c_n|^2\tau = 1.$$

In a similar way it is possible to ascertain the approximate equality 
between the Fourier coefficients in the expansion of x(t), or any function 
of x supposed to be determined as a function of ¢ according to the 
classical laws of motion, and the ‘corresponding’ matrix elements of 
this function F(x).

In order to determine the Fourier coefficient x°(n) in the expansion 
(79) we multiply x(t) by $e^{-i2\pi n\nu t}$ and notice that the constant term in
the resulting expansion is just $x^0(n)$.

We thus get
$$x^0(n) = \frac{1}{\tau}\int_0^\tau x(t)\,e^{-i2\pi n\nu t}\,dt,$$
or, in the alternative notation corresponding to (81b),
$$x^0_{mn} = \frac{1}{\tau}\int_0^\tau x(t)\,e^{-i2\pi(m-n)\nu t}\,dt.$$
The coordinate x can be replaced here, as just mentioned, by any
function of x (or of x and g) giving

$$F^0_{mn} = \frac{1}{\tau}\int_0^\tau F(t)\,e^{-i2\pi(m-n)\nu t}\,dt. \tag{82a}$$


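On the other hand, we have by the definition of the matrix elements
$$F^0_{mn} = \int_{x'}^{x''}\psi^{0*}_m F\psi^0_n\,dx,$$
or, according to (82), with $s(x,t) = s^0(x) - Wt$, $\psi^0 = \sqrt{2/\tau}\,\dfrac{1}{\sqrt{v}}\,e^{i2\pi s^0(x)/h}$,

$$F^0_{mn} = \frac{2}{\tau}\int_{x'}^{x''}F(x)\,e^{i2\pi[s^0_n(x)-s^0_m(x)]/h}\,\frac{dx}{\sqrt{v_n v_m}}. \tag{82b}$$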


Now if the states n and m differ but little with respect to their
energy, we can replace $\sqrt{v_n v_m}$ by a certain mean value of the velocity
for an energy W lying between $W_n$ and $W_m$, and put accordingly
$dx/\sqrt{v_n v_m} = dt$ just as in the case n = m. We have further under the
same condition

$$s^0_n(x) - s^0_m(x) = \frac{\partial s^0(x)}{\partial J}\,(J_n - J_m),$$

where J is the action variable (80) (defined in Chap. I, § 5, for the
general case of a three-dimensional motion), and $J_n = nh$, $J_m = mh$
its quantized values. In the case here considered of a one-dimensional
motion the function $s^0(x)$ can be readily determined, from the equation
$g = \partial s^0(x)/\partial x$ defining it, by the formula

$$s^0(x) = \int g\,dx = \int\sqrt{2m(W-U)}\,dx,$$
whence it follows [cf. the derivation of (80a)] that
$$\frac{\partial s^0(x)}{\partial W} = \int\frac{m\,dx}{\sqrt{2m(W-U)}} = \int\frac{dx}{v} = t + \text{const.},$$
and consequently (dropping the irrelevant constant)
$$\left(\frac{\partial s^0}{\partial J}\right)_{x=\text{const.}} = \frac{\partial s^0}{\partial W}\,\frac{dW}{dJ} = t\,\frac{dW}{dJ}.$$
We thus get with the above approximation

$$s^0_n(x) - s^0_m(x) = t\,\frac{dW}{dJ}\,(J_n - J_m),$$

or, since with the same approximation $(J_n - J_m)\,dW/dJ = W_n - W_m$,
$$s^0_n(x) - s^0_m(x) = (W_n - W_m)\,t. \tag{82c}$$
This gives, on substitution in (82b),

$$F^0_{mn} = \frac{2}{\tau}\int_0^{\tau/2}F(t)\,e^{-i2\pi(W_m-W_n)t/h}\,dt,$$

which coincides with (82a) when we remember that
$(m-n)\nu = (W_m - W_n)/h$.
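The approximate equality of Fourier coefficients and matrix elements can be tried out on the linear oscillator. The sketch below rests on assumptions that are not derived in this section: the classical motion is taken as x(t) = A cos ωt with energy W = ½mω²A², the quantum levels as (n + ½)hν, and the matrix element between neighbouring states as the standard oscillator value $\sqrt{(h/2\pi)(n+1)/(2m\omega)}$; units with h/2π = m = ω = 1 are arbitrary. The classical coefficient is computed from the integral formula for $x^0(k)$ by uniform sampling over one period.

```python
import cmath
import math

hbar = mass = omega = 1.0          # illustrative units
tau = 2.0 * math.pi / omega        # period of the classical motion

def fourier_coefficient(x_of_t, k, samples=2048):
    """(1/tau) * integral over one period of x(t) exp(-i k omega t), by uniform sampling."""
    total = 0j
    for j in range(samples):
        t = j * tau / samples
        total += x_of_t(t) * cmath.exp(-1j * k * omega * t)
    return total / samples

def classical_amplitude(W):
    """|x^0(1)| for the classical oscillation of energy W = (1/2) m omega^2 A^2."""
    A = math.sqrt(2.0 * W / (mass * omega**2))
    return abs(fourier_coefficient(lambda t: A * math.cos(omega * t), 1))

def quantum_element(n):
    """Assumed standard oscillator value of |x^0_{n+1,n}|."""
    return math.sqrt(hbar * (n + 1) / (2.0 * mass * omega))

def ratio_at_lower_level(n):
    return classical_amplitude(hbar * omega * (n + 0.5)) / quantum_element(n)

def ratio_at_mean_energy(n):
    return classical_amplitude(hbar * omega * (n + 1.0)) / quantum_element(n)
```

Evaluating the classical motion at the lower level gives a ratio that tends to 1 as n grows, while choosing the mean of the two energies happens, for the oscillator, to give agreement at every n; this illustrates why the text asks for an energy W lying between $W_n$ and $W_m$.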

The preceding results can easily be extended to the general case of 
the motion of a particle with three degrees of freedom in a limited 
region of space. According to classical mechanics such a motion can 
be described under certain very general assumptions as a ‘conditionally 
periodic’ motion, which means that the coordinates, or any function F 


of the latter, can be represented as a function of the time by a triple
Fourier series with three different (incommensurable) fundamental
frequencies $\nu_1$, $\nu_2$, $\nu_3$:

$$F_{n_1n_2n_3}(t) = \sum_{m_1}\sum_{m_2}\sum_{m_3} F^0_{m_1m_2m_3;\,n_1n_2n_3}\,e^{i2\pi[(m_1-n_1)\nu_1+(m_2-n_2)\nu_2+(m_3-n_3)\nu_3]t},$$

the coefficients $F^0$ being determined by the formula

$$F^0_{m_1m_2m_3;\,n_1n_2n_3} = \lim_{T\to\infty}\frac{1}{T}\int_0^T F_{n_1n_2n_3}(t)\,e^{-i2\pi[(m_1-n_1)\nu_1+(m_2-n_2)\nu_2+(m_3-n_3)\nu_3]t}\,dt.$$

According to wave mechanics, a series of this kind, as a whole, will
have no (or at least no exact) significance; the totality of the harmonic
terms in all such series, corresponding to all possible states $n_1$, $n_2$, $n_3$,
will, however, constitute an approximate expression of the matrix
representing the quantity F. The exact expression of its matrix
components can be obtained if we replace the classical frequencies
$(m_1-n_1)\nu_1+(m_2-n_2)\nu_2+(m_3-n_3)\nu_3$ by the transition frequencies
$(W_{m_1m_2m_3}-W_{n_1n_2n_3})/h$ and define the amplitudes $F^0_{m_1m_2m_3;\,n_1n_2n_3}$ by the
integrals $\int \psi^{0*}_{m_1m_2m_3}F\psi^0_{n_1n_2n_3}\,dV$. The approximate equivalence of this
definition to the classical one given above can be shown with the help
of equations (32), (32a), and (32b) of § 5 in exactly the same way
as before.

One might be tempted to think that it would be possible to give a correct wave-mechanical definition of the quantity $F$ as a function of the time by replacing the classical amplitudes and frequencies in the preceding expression for $F_{n_1n_2n_3}(t)$ by the quantum ones, i.e. by putting

$$F_{n_1n_2n_3}(t) = \sum_{m_1}\sum_{m_2}\sum_{m_3} F^0_{m_1m_2m_3;\,n_1n_2n_3}\, e^{i2\pi(W_{m_1m_2m_3}-W_{n_1n_2n_3})t/h}.$$

The fact that no physical significance can be attached to this ‘modified’ Fourier series is, however, clearly illustrated by the possibility of multiplying the functions $\psi_{n_1n_2n_3}$ by arbitrary phase factors $e^{-i\alpha_{n_1n_2n_3}}$, resulting in the multiplication of the matrix elements by the phase factors $e^{i(\alpha_{m_1m_2m_3}-\alpha_{n_1n_2n_3})}$, which are completely irrelevant from the point of view of the wave-mechanical or the matrix theory, but profoundly influence the ‘modified’ definition of the function $F_{n_1n_2n_3}(t)$.
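The effect of the arbitrary phases can be illustrated numerically. The following is only a sketch under stated assumptions: the three-level matrix `F`, the levels `W`, and the phases `alpha` are made-up illustrative values (not taken from the text), and the function names are hypothetical. It shows that rephasing the wave functions leaves every modulus $|F_{mn}|$ unchanged while altering the value of the ‘modified’ series.

```python
import cmath

def rephase(F, alpha):
    """Matrix elements after psi_n -> psi_n * exp(-i*alpha[n])."""
    n = len(F)
    return [[F[m][k] * cmath.exp(1j * (alpha[m] - alpha[k])) for k in range(n)]
            for m in range(n)]

def modified_series(F, W, n, t, h=1.0):
    """Sum over m of F[m][n] * exp(i*2*pi*(W[m]-W[n])*t/h)."""
    return sum(F[m][n] * cmath.exp(2j * cmath.pi * (W[m] - W[n]) * t / h)
               for m in range(len(F)))

# Illustrative three-level system (all numbers are assumptions).
F = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 2.0],
     [0.0, 2.0, 0.0]]
W = [0.5, 1.5, 2.5]
alpha = [0.0, 0.7, 1.9]
G = rephase(F, alpha)

# The moduli of the matrix elements are unaffected by the phases ...
moduli_equal = all(abs(abs(G[m][k]) - abs(F[m][k])) < 1e-12
                   for m in range(3) for k in range(3))
# ... but the 'modified' series for the state n = 1 is not.
series_F = modified_series(F, W, 1, 0.0)
series_G = modified_series(G, W, 1, 0.0)
```

Comparing `series_F` with `series_G` at any fixed time makes the point of the paragraph above: the two sums differ although the matrices `F` and `G` are physically equivalent.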


13. Application of the Matrix Method to Oscillatory and Rotational Motion

The matrix mechanics of Heisenberg, Born, and Jordan can be considered as a kind of ‘skeleton’ of Schrödinger’s wave mechanics, complete in itself but nevertheless deprived of the flesh and blood of the probability conception, which forms the vital element of wave mechanics. In addition, the wave-mechanical theory has another advantage over the matrix theory, for, as a rule, it is easier to solve Schrödinger’s equation for the characteristic functions of the energy operator and then to use these functions to calculate the matrix elements of any other operator by means of integration, than to determine these matrix elements from the condition that the matrix of the energy is diagonal, together with the commutation relations for the coordinates and momentum components, without knowing or using the characteristic functions at all.

The practical application of the matrix theory to concrete problems can, however, be made much easier and more convenient if instead of carrying out the matrix representation directly with respect to the fundamental operator relations $p_x x - x p_x = \dfrac{h}{2\pi i}\,1$, etc., together with the condition that $H(x,y,z;p_x,p_y,p_z)$ is diagonal, it is carried out with respect to some other operator relations between certain more complicated functions $F$, $G$, etc., the choice of which depends upon the character of the problem [i.e. on the potential-energy function $U(x,y,z)$] if at least some of these functions commute with the energy, i.e. represent constants of the motion. If $G$ is such a constant (it may, in particular, coincide with the energy $H$), and if some other function $F$ (for instance, the coordinate $x$) has been found which satisfies a commutation relation of the form $GF - FG = \alpha F + \beta G$, where $\alpha$ and $\beta$ are constant, the matrix interpretation leads very simply to the determination of the matrix elements both of $G$, which can be assumed to be diagonal, and of $F$. Applying the matrix multiplication rule to the left side of the preceding equation, we get

$$(GF - FG)_{mn} = (G_{mm} - G_{nn})F_{mn} = \alpha F_{mn} + \beta G_{nn}\delta_{mn},$$

whence it follows that all the matrix elements of $F$ vanish with the exception of the diagonal elements, which are equal to

$$F_{nn} = -\frac{\beta}{\alpha}\,G_{nn},$$

and those for which $G_{mm} - G_{nn} = \alpha$.


This equation leads very simply to the determination of the numbers $G_{nn}$, especially when $n$ can be treated as a simple quantum number (and not as a set of several quantum numbers $n_1, n_2, n_3$, all of them different from the numbers $m_1, m_2, m_3$ represented by $m$). By a suitable labelling of the states associated with given values of $G$, we can make those states for which the values of $G$ differ by $\alpha$ successive, i.e. having values of $n$ and $m$ differing by 1, so that the preceding equation will reduce to $G_{n+1,n+1} - G_{nn} = \alpha$. The solution of this equation is obviously of the form $G_{nn} = \alpha n + \gamma$, where $\gamma$ is a certain constant. We shall not develop these general considerations but shall merely illustrate and amplify them by means of two special problems of outstanding simplicity and practical importance: the problem of a linear harmonic oscillator and the problem of the rotational part of the motion of a particle in a central (radially symmetrical) field of force.
The energy of a linear harmonic oscillator is expressed by the operator or matrix (as we please)

$$H = \frac{1}{2m}\,p^2 + \tfrac12 (2\pi\nu_0)^2 m x^2, \tag{83}$$

where $\nu_0$ is the natural vibration frequency of the classical theory. According to the matrix theory $H$ has to be ‘diagonalized’ subject to the additional condition

$$px - xp = \frac{h}{2\pi i}\,1, \tag{83a}$$

1 being the unit matrix.

We shall put, for the sake of brevity,

$$2\pi\nu_0 m x = q, \qquad 2mH = K, \qquad h\nu_0 m = w,$$

so that (83) and (83a) can be written in the form

$$p^2 + q^2 = K, \qquad pq - qp = -iw, \tag{83b}$$

it being understood that $w$ denotes the product of the factor $h\nu_0 m$ and the unit matrix.
We shall now introduce the matrices

$$r = p + iq \quad\text{and}\quad s = p - iq, \tag{84}$$

which are more convenient to deal with than $p$ and $q$ taken separately. Taking their product in the order $rs$, we get

$$rs = pp + iqp - ipq + qq = p^2 + q^2 - i(pq - qp),$$

i.e.

$$rs = K - w. \tag{84a}$$

Similarly we get

$$sr = K + w. \tag{84b}$$

Hence, using the associative law,

$$rsr = (rs)r = (K - w)r, \qquad rsr = r(sr) = r(K + w),$$

i.e. putting $K - w = L$,

$$Lr - rL = 2rw. \tag{85}$$

Now since $K$ and $w$, and consequently $L$, are diagonal matrices, we have $(Lr - rL)_{mn} = (L_{mm} - L_{nn})\,r_{mn}$ and $(rw)_{mn} = w\,r_{mn}$, where $w$ denotes now not the matrix but simply the number $h\nu_0 m$, so that the preceding equation can be written in the form

$$(L_{mm} - L_{nn} - 2w)\,r_{mn} = 0. \tag{85a}$$

Thus either $r_{mn} = 0$, or $L_{mm} - L_{nn} = 2w$. In the same way we get

$$srs = (K + w)s = s(K - w)$$

and

$$(L_{mm} - L_{nn} + 2w)\,s_{mn} = 0, \tag{85b}$$

so that either $s_{mn} = 0$ or $L_{mm} - L_{nn} = -2w$. Now

$$L_{mm} - L_{nn} = K_{mm} - K_{nn} = 2m(H_{mm} - H_{nn}) = 2m(W_m - W_n)$$

is the difference of the energy-levels for the states $m$ and $n$ multiplied by $2m$ (here $m$ in the factor $2m$ is the mass and not the label number of the state!). We thus see that the energy-levels must form an arithmetical progression with the difference $2w/2m = h\nu_0$, so that we can put

$$W_n = nh\nu_0 + \text{const.} \tag{86}$$
With this labelling of the stationary states we must have

$$r_{mn} = 0 \ \text{unless}\ m = n+1, \qquad s_{mn} = 0 \ \text{unless}\ m = n-1. \tag{86a}$$

The value of the constant in the expression for $W_n$ can be obtained from the condition that the lowest value of $L_{nn}$ must be equal to zero. This condition follows from the equation

$$(rs)_{nn} = \sum_k r_{nk}s_{kn} = r_{n,n-1}\,s_{n-1,n} = L_{nn}$$

in conjunction with the fact that $K_{nn}$ cannot assume negative values because the matrix $K$ represents an essentially positive or rather non-negative quantity, namely, $p^2 + q^2$ (with $p$ and $q$ both real). Hence we conclude that the series of stationary states must terminate with some state $n_{\min}$ which we can obviously label as $n = 0$. The matrix elements $r_{n,n-1}$ and $s_{n-1,n}$ must obviously vanish for $n \le 0$, since the states $n \le -1$ do not exist, whence it follows that $L_{00} = 0$, or $K_{00} = w$, and consequently $H_{00} = W_0 = \tfrac12 h\nu_0$, that is,

$$W_n = h\nu_0\,(n + \tfrac12) \tag{86b}$$

in agreement with the result obtained in Part I, § 13, by means of the wave-mechanical treatment of the problem of the linear oscillator.

Further, for $n > 0$ we get

$$r_{n,n-1}\,s_{n-1,n} = 2mh\nu_0\,n. \tag{87}$$

Now from the definition of $r$ and $s$ according to (84), or

$$r_{n,n-1} = p_{n,n-1} + i q_{n,n-1}, \qquad s_{n-1,n} = p_{n-1,n} - i q_{n-1,n},$$

together with the Hermitian character of the matrices $p$ and $q$ (which expresses the reality of the quantities represented by them), it follows that

$$s_{n-1,n} = r^*_{n,n-1}. \tag{87a}$$

We thus have

$$|r_{n,n-1}| = |s_{n-1,n}| = \sqrt{2mh\nu_0\,n}. \tag{87b}$$

Coming back from $r$ and $s$ to $p$ and $q$, we have $p = \tfrac12(r+s)$, $q = -\tfrac12 i(r-s)$, and consequently

$$p_{n,n-1} = \tfrac12 r_{n,n-1}, \quad p_{n-1,n} = \tfrac12 s_{n-1,n}, \qquad q_{n,n-1} = -\tfrac12 i\, r_{n,n-1}, \quad q_{n-1,n} = \tfrac12 i\, s_{n-1,n}, \tag{88}$$

all the other matrix elements $p_{mn}$ and $q_{mn}$ vanishing.

We thus get

$$|p_{n,n-1}| = |q_{n,n-1}| = \sqrt{\tfrac12 m n h\nu_0} \tag{88a}$$

and, returning to the original coordinate, $x = q/(2\pi\nu_0 m)$,

$$|x_{n,n-1}| = \sqrt{\frac{nh}{8\pi^2\nu_0 m}} = \frac{1}{2\pi\nu_0 m}\,|p_{n,n-1}|. \tag{88b}$$

The latter relation between $x$ and $p$ can be obtained directly from the equation $p = m\,dx/dt$, which gives

$$p_{nk} = m\,i2\pi\nu_{nk}\,x_{nk},$$

i.e. since $\nu_{nk} = (W_n - W_k)/h = (n-k)\nu_0$,

$$p_{n,n-1} = i2\pi\nu_0 m\,x_{n,n-1}.$$
The derivation of the formulae (88) and (88a) by the purely wave-mechanical method, i.e. through evaluation of the integrals

$$x_{mn} = \int_{-\infty}^{+\infty} \psi_m^*\, x\, \psi_n\,dx \quad\text{and}\quad p_{mn} = \frac{h}{2\pi i}\int_{-\infty}^{+\infty} \psi_m^*\,\frac{d\psi_n}{dx}\,dx,$$

where $\psi_m$ and $\psi_n$ are the normalized characteristic functions of the harmonic oscillator, would require a much larger amount of more complicated calculation.
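The matrix results for the oscillator can be checked numerically on a truncated set of states. The sketch below is illustrative only: it assumes units with $m = h = \nu_0 = 1$, chooses the phases of $r_{n,n-1}$ real, and uses hypothetical helper names. It builds $p$ and $q$ from (84) and (87b) and inspects the diagonals of $H = (p^2+q^2)/2m$ and of $pq - qp$.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def oscillator_matrices(dim, m=1.0, h=1.0, nu0=1.0):
    """Truncated p and q built from r = p + iq and s = p - iq, cf. (84)."""
    r = [[0j] * dim for _ in range(dim)]
    s = [[0j] * dim for _ in range(dim)]
    for n in range(1, dim):
        amp = math.sqrt(2.0 * m * h * nu0 * n)  # |r_{n,n-1}| from (87b)
        r[n][n - 1] = amp  # phase chosen real (an assumption)
        s[n - 1][n] = amp  # s is the Hermitian conjugate of r, cf. (87a)
    p = [[0.5 * (r[i][j] + s[i][j]) for j in range(dim)] for i in range(dim)]
    q = [[-0.5j * (r[i][j] - s[i][j]) for j in range(dim)] for i in range(dim)]
    return p, q

def energy_diagonal(dim, m=1.0, h=1.0, nu0=1.0):
    """Diagonal of H = (p^2 + q^2)/2m for the truncated matrices."""
    p, q = oscillator_matrices(dim, m, h, nu0)
    pp, qq = matmul(p, p), matmul(q, q)
    return [((pp[n][n] + qq[n][n]) / (2.0 * m)).real for n in range(dim)]

def commutator_diagonal(dim, m=1.0, h=1.0, nu0=1.0):
    """Diagonal of pq - qp, to be compared with -i*w, w = h*nu0*m, cf. (83b)."""
    p, q = oscillator_matrices(dim, m, h, nu0)
    pq, qp = matmul(p, q), matmul(q, p)
    return [pq[n][n] - qp[n][n] for n in range(dim)]

levels = energy_diagonal(6)
comm = commutator_diagonal(6)
```

All entries except the last reproduce $W_n = h\nu_0(n+\tfrac12)$ and $pq - qp = -iw$; the last diagonal entry is spoiled by cutting the infinite matrices off at a finite dimension, which is an artefact of the sketch and not of the theory.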

In the case of the hydrogen-like atom, the wave-mechanical method, on the contrary, proves much more simple and convenient than the matrix method for the determination of the energy values and the matrix components. The matrix method can, however, be applied with advantage in this case, as well as in the general case of the motion of a particle in any central field of force, for the determination of quantities which wave-mechanically depend upon the angular part of the wave functions only [i.e. on the spherical harmonic functions $Y_{lm}(\theta,\phi)$].

Here belong in the first place the components of the angular momentum $M_x$, $M_y$, $M_z$, or rather their matrix elements with regard to states differing from each other by the values of the axial quantum number $m$ (or also of the angular quantum number $l$), including, of course, their characteristic values.

The purely matrix determination of these quantities can be obtained most simply if one starts from the commutation relation

$$\mathbf{M}\times\mathbf{M} = -\frac{h}{2\pi i}\,\mathbf{M},$$

which has been deduced in the preceding chapter with the help of the operator definition of the vector $\mathbf{M}$.

We shall put, for the sake of brevity,

$$M_x = \frac{h}{2\pi}A, \qquad M_y = \frac{h}{2\pi}B, \qquad M_z = \frac{h}{2\pi}C,$$

so that the commutation relation above referred to assumes the form

$$AB - BA = iC, \qquad BC - CB = iA, \qquad CA - AC = iB, \tag{89}$$

$A$, $B$, and $C$ being regarded here as matrices.

We shall introduce the matrix

$$N = A^2 + B^2 + C^2, \tag{89a}$$

which (multiplied by $h^2/4\pi^2$) represents the square of the total angular momentum ($M^2$), and shall show that it commutes with each of the matrices $A$, $B$, $C$ (the proof is the same as if they were treated as operators).

We have, namely,

$$CA^2 - A^2C = (CA - AC)A + A(CA - AC) = +i(BA + AB),$$

and similarly

$$CB^2 - B^2C = (CB - BC)B + B(CB - BC) = -i(AB + BA).$$

Adding these equations to the equation $C\,C^2 - C^2\,C = 0$, we get

$$CN - NC = 0, \tag{89b}$$

and in the same way $AN - NA = 0$ and $BN - NB = 0$.


Since, moreover, we know that $N$ commutes with the energy matrix $H$, it must be a constant of the motion, and its characteristic values, together with the characteristic values of $H$, i.e. the diagonal elements of $N$ and $H$ in a matrix representation corresponding to characteristic functions of both $H$ and $N$, can be used to specify the stationary states. We know, furthermore, that these characteristic functions can be chosen in such a way [by putting $Y_{lm}(\theta,\phi) = P_{lm}(\theta)e^{im\phi}$] that one of the three matrices $A$, $B$, $C$ ($C$ say) shall also be diagonal (corresponding to $C = \text{const.}$). Using the results obtained before by the wave-mechanical method, we can thus define $N$ and $C$ as diagonal matrices with the elements

$$N_{lm;\,lm} = l(l+1), \qquad C_{lm;\,lm} = m. \tag{89c}$$

These results can be obtained independently by the purely matrix method, if we confine ourselves to matrix elements corresponding to the same energy values and assume both $N$ and $C$ to be diagonal matrices (which we obviously can do for the sake of simplicity, although this is by no means necessary).

We shall consider first such matrix elements of $A$ and $B$ as correspond to states with the same value of $N$ and shall distinguish these states accordingly by one index $m$ only, specifying the characteristic values (i.e. the diagonal elements) of $C$.

As in the case of the oscillator, we shall not consider $A$ and $B$ separately but in the conjugate complex combinations

$$A + iB = R, \qquad A - iB = S. \tag{90}$$

Replacing the $L$ of the oscillator theory by $C$, we have, according to (89),

$$(A + iB)C - C(A + iB) = (AC - CA) + i(BC - CB) = -(iB + A),$$

i.e.

$$CR - RC = R, \tag{90a}$$

and similarly

$$CS - SC = -S. \tag{90b}$$

These equations are of exactly the same form as equation (85) for $r$ and the corresponding equation for $s$, the constant $w$ being replaced by $\tfrac12$. We thus get, in the same way as before,

$$C_{mm} = m + \text{const.}, \tag{91}$$

the non-vanishing elements of $R$ and $S$ being $R_{m,m-1}$ and $S_{m-1,m}$, and having the same numerical value since

$$S_{m-1,m} = R^*_{m,m-1}. \tag{91a}$$

The latter, together with the value of the constant in (91), can be derived from the equation

$$RS = (A + iB)(A - iB) = A^2 + B^2 + C = A^2 + B^2 + C^2 + \tfrac14 - (C^2 - C + \tfrac14),$$

i.e.

$$RS = N + \tfrac14 - (C - \tfrac12)^2. \tag{92}$$

Taking the diagonal elements of both sides, we get

$$(RS)_{mm} = R_{m,m-1}\,S_{m-1,m} = N + \tfrac14 - (C_{mm} - \tfrac12)^2, \tag{92a}$$

where $N$ now denotes not the matrix $N$ but the diagonal element of this matrix corresponding to the state in question (with no subscript $m$ affixed to it because it does not depend upon $m$). In a similar way we find

$$(SR)_{mm} = S_{m,m+1}\,R_{m+1,m} = N + \tfrac14 - (C_{mm} + \tfrac12)^2. \tag{92b}$$

It should be remarked that the same expression can be written in the form $(RS)_{m+1,m+1}$, so that we must have, according to (92a),

$$(C_{mm} + \tfrac12)^2 = (C_{m+1,m+1} - \tfrac12)^2,$$

which is, of course, in agreement with (91).

Now since $A^2 + B^2 + C^2 = N$, the characteristic values of the operator $C$ or, what is the same thing, the diagonal elements of the matrix $C$ must lie within certain limits, the maximum value $C'$ not exceeding $+N^{1/2}$ and the minimum value $C''$ being not smaller than $-N^{1/2}$. Denoting the corresponding limiting values of $m$ by $m'$ and $m''$ respectively, we must have

$$R_{m'+1,m'} = S_{m',m'+1} = 0 \quad\text{and}\quad R_{m'',m''-1} = S_{m''-1,m''} = 0.$$

This gives, according to (92b),

$$C_{m'm'} = -\tfrac12 + \sqrt{N + \tfrac14},$$

and, according to (92a),

$$C_{m''m''} = \tfrac12 - \sqrt{N + \tfrac14} = -C_{m'm'}, \tag{93}$$

as would be expected from the fact that the relation $A^2 + B^2 + C^2 = N$ determines the square of $C$.

The difference $C_{m'm'} - C_{m''m''} = m' - m''$ is obviously an integral number, one less than the number $J$ of states with different values of $C_{mm}$ which are possible for a given value of $N$. Since by (93) this difference is equal to $2\sqrt{N+\tfrac14} - 1$, we obtain the following condition for $N$:

$$2\sqrt{N + \tfrac14} = \text{integer} = J;$$

that is,

$$N = \tfrac14(J^2 - 1) = \tfrac14(J+1)(J-1). \tag{93a}$$

This expression reduces to the usual form

$$N = l(l+1) \tag{94}$$

if we put $J = 2l+1$, i.e. define $J$ as an odd integer, giving for the limiting values of $C_{mm}$

$$C_{m'm'} = +l, \qquad C_{m''m''} = -l, \tag{94a}$$

i.e. by (91) $m' = +l$, $m'' = -l$. We thus get

$$C_{mm} = m, \tag{94b}$$

and consequently

$$M_z = \frac{h}{2\pi}\,C_{mm} = \frac{hm}{2\pi}, \qquad M^2 = \frac{h^2}{4\pi^2}\,N = \frac{h^2}{4\pi^2}\,l(l+1),$$

in accordance with our previous results. It is, however, important to notice that the matrix theory admits another possibility corresponding to $J$ being an even integer, $2k$ say. We get, in this case,

$$N = (k + \tfrac12)(k - \tfrac12) \tag{95}$$

and

$$C_{m'm'} = k - \tfrac12, \qquad C_{m''m''} = -(k - \tfrac12), \tag{95a}$$

whence

$$C_{mm} = m + \tfrac12 \tag{95b}$$

with $m' = k-1$ and $m'' = -k$, or

$$C_{mm} = m - \tfrac12$$

with $m' = k$ and $m'' = -(k-1)$. These results can be put in the same form as the preceding results if we define $l$ as a half-integral angular quantum number

$$l = k - \tfrac12$$

and $m$ as a half-integral axial quantum number, varying between the limits $+l$ and $-l$.

We shall then get, as before, $C_{mm} = m$. We thus see, by this example, that the matrix theory is, in a certain respect, more general than the wave-mechanical theory, at least in that form in which it has been developed hitherto. We shall give in a later chapter a generalization of it which provides an equivalent for the half-integral values of $l$ and $m$ of the matrix theory of the angular momentum.

The non-diagonal matrices of the $x$ and $y$ components of the latter can easily be derived from (90), (91a), and (92a). We shall not, however, examine the matrices $M_x$ and $M_y$ separately, but shall examine their combinations

$$M_x + iM_y = \frac{h}{2\pi}R, \qquad M_x - iM_y = \frac{h}{2\pi}S,$$

for the non-vanishing elements of which the following expressions are obtained:

$$(M_x + iM_y)_{m+1,m} = \frac{h}{2\pi}\sqrt{(l+\tfrac12)^2 - (m+\tfrac12)^2}\;e^{i\alpha_m}, \tag{96}$$

$$(M_x - iM_y)_{m,m+1} = \frac{h}{2\pi}\sqrt{(l+\tfrac12)^2 - (m+\tfrac12)^2}\;e^{-i\alpha_m}, \tag{96a}$$

where $e^{i\alpha_m}$ is an arbitrary phase factor.

A derivation of these results by the usual wave-mechanical method, i.e. by means of the integral expressions for the matrix elements, would require a thorough knowledge of the spherical harmonic functions $Y_{lm} = P_{lm}(\theta)e^{im\phi}$ and would be much more laborious than the preceding calculations.
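As a numerical cross-check of (89), (94) and (96), the finite matrices $A$, $B$, $C$ can be built for a given $l$ and the relations $AB - BA = iC$ and $N = l(l+1)$ verified directly. This is a sketch only: the phases $\alpha_m$ are taken as zero, the function names are hypothetical, and $l$ is passed as the integer `two_l` $= 2l$ so that both integral and half-integral values can be tried.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def angular_momentum_matrices(two_l):
    """A, B, C (units of h/2pi) for l = two_l/2, rows ordered m = -l, ..., +l."""
    l = two_l / 2.0
    dim = two_l + 1
    ms = [-l + k for k in range(dim)]
    R = [[0j] * dim for _ in range(dim)]
    for k in range(dim - 1):
        # |R_{m+1,m}| from (96); the phase alpha_m is set to zero (an assumption)
        R[k + 1][k] = complex(math.sqrt((l + 0.5) ** 2 - (ms[k] + 0.5) ** 2))
    S = [[R[j][i].conjugate() for j in range(dim)] for i in range(dim)]
    A = [[0.5 * (R[i][j] + S[i][j]) for j in range(dim)] for i in range(dim)]
    B = [[-0.5j * (R[i][j] - S[i][j]) for j in range(dim)] for i in range(dim)]
    C = [[complex(ms[i]) if i == j else 0j for j in range(dim)]
         for i in range(dim)]
    return A, B, C

def max_deviation(two_l):
    """Largest violation of AB - BA = iC and of N = l(l+1) times the unit matrix."""
    l = two_l / 2.0
    A, B, C = angular_momentum_matrices(two_l)
    dim = len(A)
    AB, BA = matmul(A, B), matmul(B, A)
    AA, BB, CC = matmul(A, A), matmul(B, B), matmul(C, C)
    worst = 0.0
    for i in range(dim):
        for j in range(dim):
            worst = max(worst, abs(AB[i][j] - BA[i][j] - 1j * C[i][j]))
            n_ij = AA[i][j] + BB[i][j] + CC[i][j]
            worst = max(worst, abs(n_ij - (l * (l + 1) if i == j else 0.0)))
    return worst
```

Calling `max_deviation(2)` corresponds to $l = 1$ and `max_deviation(1)` to the half-integral case $l = \tfrac12$; in both cases the deviation should be at the level of rounding error, in line with the remark that the matrix relations admit half-integral values as well.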

The preceding method can also be applied to the calculation of the matrix elements of the coordinates $x$, $y$, $z$ and momentum components $p_x$, $p_y$, $p_z$, for such states at least as differ from each other in the quantum numbers $m$ and $l$ only (and which in the case of the hydrogen-like atom belong to the same energy-level). To do this we shall examine first the expressions $M_z x - xM_z$, $M_z y - yM_z$, and $M_z z - zM_z$. Since

$$\frac{2\pi i}{h}(M_z x - xM_z) = [M_z, x] = \frac{\partial M_z}{\partial p_x} \quad\text{and}\quad M_z = xp_y - yp_x,$$

we get $[M_z, x] = -y$, and in the same way $[M_z, y] = +x$, $[M_z, z] = 0$. Putting

$$x + iy = \xi, \qquad x - iy = \eta, \tag{97}$$

we thus have

$$[M_z, \xi] = -y + ix = i(x + iy) = i\xi, \qquad [M_z, \eta] = -y - ix = -i(x - iy) = -i\eta,$$

or, with $M_z = hC/2\pi$,

$$C\xi - \xi C = \xi, \qquad C\eta - \eta C = -\eta, \tag{97a}$$

and

$$Cz - zC = 0. \tag{97b}$$

It follows immediately from these relations that, so far as the quantum number $m$ is concerned ($l$ being left undetermined), $z$ is a diagonal matrix with non-vanishing elements $z_{mm}$, while $\xi$ and $\eta$ are matrices with non-vanishing elements of the form

$$\xi_{m,m-1} \quad\text{and}\quad \eta_{m-1,m},$$

as in the case of the harmonic oscillator.

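Let us consider now the commutation relations between the quantities (operators, matrices) $\xi$, $\eta$ on the one hand and $R$, $S$ on the other. We have

$$[M_x + iM_y, \xi] = [M_x, \xi] + i[M_y, \xi] = [M_x, x] + i[M_x, y] + i[M_y, x] - [M_y, y] = i\left(\frac{\partial M_x}{\partial p_y} + \frac{\partial M_y}{\partial p_x}\right) = i(-z + z) = 0,$$

and similarly $[M_x - iM_y, \xi] = -2iz$, or

$$R\xi - \xi R = 0, \tag{98}$$

$$S\xi - \xi S = -2z. \tag{98a}$$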


From the first of these equations we get

$$(R\xi)_{m+1,m-1} = R_{m+1,m}\,\xi_{m,m-1} = (\xi R)_{m+1,m-1} = \xi_{m+1,m}\,R_{m,m-1},$$

i.e.

$$\frac{\xi_{m+1,m}}{R_{m+1,m}} = \frac{\xi_{m,m-1}}{R_{m,m-1}} = \text{const.} = a,$$

and likewise from (98a)

$$2z_{mm} = (\xi S)_{mm} - (S\xi)_{mm} = \xi_{m,m-1}S_{m-1,m} - S_{m,m+1}\xi_{m+1,m} = a\,[R_{m,m-1}S_{m-1,m} - S_{m,m+1}R_{m+1,m}] = a\,[(RS)_{mm} - (SR)_{mm}].$$

We thus see that the non-vanishing matrix elements of the coordinates are determined, disregarding an irrelevant proportionality factor, by the matrix elements of the angular momentum. Substituting, in the preceding equations, the expressions for $R_{m,m-1}$ and $S_{m-1,m}$ derived before, we get $2z_{mm} = a\{[(l+\tfrac12)^2 - (m-\tfrac12)^2] - [(l+\tfrac12)^2 - (m+\tfrac12)^2]\}$, i.e.

$$z_{mm} = am, \qquad |\xi_{m+1,m}| = |\eta_{m,m+1}| = a\sqrt{(l+\tfrac12)^2 - (m+\tfrac12)^2}. \tag{98b}$$

In deriving these results it was tacitly assumed that the total momentum remained invariant, i.e. that the angular quantum number $l$ preserved the same value in the different states to which the matrix elements (98b) refer. Affixing the index $l$, we should have written the latter in the more complete form $z_{l,m;\,l,m}$, $\xi_{l,m+1;\,l,m}$, etc.

In order to find out the matrix elements which correspond to different values of $l$, we must take into account certain commutation relations containing the matrix of the total momentum, or its square $N$ ($\times\, h^2/4\pi^2$). Taking, for instance, the relation

$$NR - RN = 0$$

(which follows from $NA - AN = 0$ and $NB - BN = 0$), we have, since $N$ is a diagonal matrix with regard both to $l$ and $m$ (as a matter of fact not depending upon $m$),

$$(NR - RN)_{l'm';\,l''m''} = \sum_{l''',m'''}\left(N_{l'm';\,l'''m'''}R_{l'''m''';\,l''m''} - R_{l'm';\,l'''m'''}N_{l'''m''';\,l''m''}\right) = (N_{l'} - N_{l''})\,R_{l'm';\,l''m''} = 0.$$

We thus see that $R_{l'm';\,l''m''}$ vanishes unless $l' = l''$, as was assumed above. This assumption is therefore justified so far as the components of the angular momentum are concerned (it can be proved in the same way for $S$ and $C$). It need not, however, hold for the coordinates, i.e. for the matrices $\xi$, $\eta$, $z$.

Taking, for instance, the $(l',m';\,l'',m'')$-element of (98), we have

$$\sum_{l''',m'''}\left(R_{l'm';\,l'''m'''}\,\xi_{l'''m''';\,l''m''} - \xi_{l'm';\,l'''m'''}\,R_{l'''m''';\,l''m''}\right) = 0$$

or

$$R_{l',m';\,l',m'-1}\,\xi_{l',m'-1;\,l'',m''} - \xi_{l',m';\,l'',m''+1}\,R_{l'',m''+1;\,l'',m''} = 0.$$

Now it can easily be seen that the results derived from (97a) and (97b) as to the non-vanishing elements of $\xi$, $\eta$, and $z$, so far as they are specified by the quantum number $m$, remain valid irrespective of the equality or inequality of the numbers $l'$ and $l''$ (since these results depend solely upon the diagonal character of $C$ with regard to $m$). The preceding equation need therefore be examined only for the case when $m'' = m' - 2$. Putting $m' = m+1$ and $m'' = m-1$, we get

$$R_{l',m+1;\,l',m}\,\xi_{l',m;\,l'',m-1} = \xi_{l',m+1;\,l'',m}\,R_{l'',m;\,l'',m-1}. \tag{99}$$
The angular quantum number $l$ represents the maximum absolute value of the axial quantum number $m$. This means that the matrix element $R_{l',m+1;\,l',m}$ will vanish unless both $|m| \le l'$ and $|m+1| \le l'$; likewise $\xi_{l',m;\,l'',m-1}$ will vanish unless $|m| \le l'$ and $|m-1| \le l''$, further $\xi_{l',m+1;\,l'',m}$ will vanish unless $|m| \le l''$ and $|m+1| \le l'$, and finally $R_{l'',m;\,l'',m-1}$ will vanish unless $|m| \le l''$ and $|m-1| \le l''$. Since equations (99) must hold for all values of $m$, both sides vanishing simultaneously, we can conclude that $l'$ and $l''$ must be connected with each other in such a way that the violation of one of the conditions

$$|m| \le l', \qquad |m+1| \le l', \qquad |m-1| \le l''$$

will entail the violation of one of the conditions

$$|m+1| \le l', \qquad |m| \le l'', \qquad |m-1| \le l''.$$

This will obviously be the case if $l' = l''$, or $l' = l''+1$, or $l' = l''-1$. We thus see that only those matrix elements of $\xi$ will be different from zero for which

$$l' - l'' = 0, +1, -1. \tag{99a}$$

For otherwise we could, by a suitable choice of $m$, make one side of (99) vanish while the other would be different from zero.

The same applies, of course, to the matrix elements of $\eta$ and $z$, or, in other words, to the matrix elements of all the three coordinates.
Putting in (99) $l' = l$ and $l'' = l-1$, and replacing the matrix elements of $R$ by their expressions (96), we get

$$\sqrt{(l+\tfrac12)^2 - (m+\tfrac12)^2}\;\xi_{l,m;\,l-1,m-1} = \sqrt{(l-\tfrac12)^2 - (m-\tfrac12)^2}\;\xi_{l,m+1;\,l-1,m},$$

or

$$\sqrt{(l+m+1)(l-m)}\;\xi_{l,m;\,l-1,m-1} = \sqrt{(l+m-1)(l-m)}\;\xi_{l,m+1;\,l-1,m}.$$

Replacing here the common factor $\sqrt{l-m}$ by $\sqrt{l+m}$, and taking into account that the expression $(l+m+1)(l+m)$ is obtained from $(l+m-1)(l+m)$ by replacing $m$ by $m+1$, we can put

$$\xi_{l,m+1;\,l-1,m} = b\sqrt{(l+m)(l+m+1)}, \tag{100}$$

where $b$ is a proportionality coefficient which does not depend either on $l$ or on $m$.
Substituting this expression in the equation

$$-2z_{l,m;\,l-1,m} = S_{l,m;\,l,m+1}\,\xi_{l,m+1;\,l-1,m} - \xi_{l,m;\,l-1,m-1}\,S_{l-1,m-1;\,l-1,m},$$

which follows from (98a), and putting

$$S_{l,m-1;\,l,m} = R^*_{l,m;\,l,m-1} = \sqrt{(l+\tfrac12)^2 - (m-\tfrac12)^2},$$

we get

$$z_{l,m;\,l-1,m} = -b\sqrt{l^2 - m^2}. \tag{100a}$$

In a similar way for the case $l' = l-1$, $l'' = l$ we obtain

$$\xi_{l-1,m+1;\,l,m} = b'\sqrt{(l-m)(l-m-1)}, \tag{100b}$$

$$z_{l-1,m;\,l,m} = b'\sqrt{l^2 - m^2}, \tag{100c}$$

where $b'$ is another coefficient of proportionality, which can be shown to have the same numerical value as $b$.

It is interesting to compare the preceding results† with the wave-mechanical method for the determination of matrix elements of the coordinates for a hydrogen-like atom.

We have, for instance,

$$z_{nlm;\,n'l'm'} = \int \psi^*_{nlm}\, z\, \psi_{n'l'm'}\,dV,$$

or, putting $\psi_{nlm} = f_{nl}(r)P_{lm}(\theta)e^{im\phi}$, $dV = r^2\,dr\,d\omega$, $d\omega = \sin\theta\,d\theta\,d\phi$, and $z = r\cos\theta$,

$$z_{nlm;\,n'l'm'} = \int_0^\infty f_{nl}f_{n'l'}\,r^3\,dr \int_0^\pi P_{lm}(\theta)P_{l'm'}(\theta)\cos\theta\sin\theta\,d\theta \int_0^{2\pi} e^{i(m'-m)\phi}\,d\phi.$$

We see, first of all, that on account of the last factor this expression vanishes unless $m' = m$. In addition it can be shown that the second factor also vanishes unless $l' = l \pm 1$. The proof is based on the fact that the product $\cos\theta\,P_{lm}(\theta)$ can be represented as the sum of two functions $P_{l+1,m}(\theta)$ and $P_{l-1,m}(\theta)$ with suitably chosen coefficients, and on the orthogonality of the functions $Y_l(\theta,\phi)$ corresponding to different values of $l$ [as characteristic functions of the operator $Q^2$ with the characteristic values $-l(l+1)$].

Replacing $z$ by $\xi = x + iy = r\sin\theta(\cos\phi + i\sin\phi) = r\sin\theta\,e^{i\phi}$, we get, in a similar way,

$$\xi_{nlm;\,n'l'm'} = \int_0^\infty f_{nl}f_{n'l'}\,r^3\,dr \int_0^\pi P_{lm}(\theta)P_{l'm'}(\theta)\sin^2\theta\,d\theta \int_0^{2\pi} e^{i(m'-m+1)\phi}\,d\phi.$$

The examination of the last factor shows at once that this expression vanishes unless $m' = m-1$; the second factor vanishes likewise if $l' \ne l \pm 1$.

The conditions relative to $m$ coincide with those obtained by the matrix method for $z$ and $\xi$; the condition $l' = l \pm 1$ is, however, more restrictive, since it excludes the case $l' = l$.
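The vanishing of the polar-angle factor unless $l' = l \pm 1$ can be checked numerically in the simplest case $m = m' = 0$, where $P_{l0}$ is the ordinary Legendre polynomial $P_l(\cos\theta)$. The sketch below is an illustration under those assumptions only (unnormalized $P_l$, Simpson integration in the variable $u = \cos\theta$, hypothetical function names); the closed form used for comparison, $2(l+1)/[(2l+1)(2l+3)]$ for $l' = l+1$, is the standard Legendre result and is not taken from the text.

```python
def legendre(l, u):
    """Legendre polynomial P_l(u) from the three-term recurrence."""
    p_prev, p = 1.0, u
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * u * p - n * p_prev) / (n + 1)
    return p

def polar_integral(l, lp, steps=2000):
    """Simpson estimate of the integral of P_l(u) * u * P_lp(u) over [-1, 1].

    With u = cos(theta) this is the theta-factor of z_{nlm;n'l'm'} for m = m' = 0.
    """
    h = 2.0 / steps
    total = 0.0
    for k in range(steps + 1):
        u = -1.0 + k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * legendre(l, u) * u * legendre(lp, u)
    return total * h / 3.0

# Table of the polar factor for l, l' = 0..3: non-zero only next to the diagonal.
table = [[polar_integral(l, lp) for lp in range(4)] for l in range(4)]
```

In `table` the entries with $l' = l$ vanish together with all those for which $|l' - l| > 1$, which is the restriction that the wave-mechanical calculation adds to the matrix result (99a).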

We see that here again, as for the values of $l$ (integral or half-integral), the matrix method leads to results of higher generality than the wave-mechanical method. It should not be inferred that the results obtained by the latter are incorrect. On the contrary, it is the results obtained by the matrix method which require some qualification. The reason for this is that the properties of the matrices which represent the components of the angular momentum of an electron are not completely specific, but, as we shall see later, are shared by matrices representing allied quantities of a more general character, which can be considered as the resultant of the angular momentum due to rotation about a fixed centre and the so-called ‘intrinsic angular momentum’ of the electron, whose origin is usually ascribed to its spin motion.

† Derived in the above way by Born and Jordan.

It is possible to generalize the wave-mechanical theory in such a way as to interpret this ‘spin effect’ and to incorporate the intrinsic momentum, allowing for the resultant angular quantum number or, as it is called, the ‘inner quantum number’ $j$ both integral and half-integral values and allowing transitions, i.e. non-vanishing matrix elements of the coordinates, for which this number changes by $\pm 1$ or remains constant. This does not, however, invalidate in the least the fact that the angular quantum number $l$, representing the ‘orbital angular momentum’ of the particle, can assume integral values only and obeys the restricted ‘selection rule’ $l' - l = \pm 1$.

The fact that we have obtained, by the matrix method, non-vanishing expressions (98b) for the matrix elements of the coordinates in the case $l' - l = 0$ does not contradict the wave-mechanical theory, for these expressions contain a proportionality factor $a$, which has not been specified and which can easily be shown to be equal to zero in the case considered (if $l$ denotes the orbital and not the total angular quantum number).

The matrix elements of the coordinates which we have calculated have a direct and indeed very important physical significance. They determine, according to the formula

$$A_{nn'} = \frac{64\pi^4\nu^3 e^2}{3hc^3}\,|z_{nn'}|^2,$$

where $e$ denotes the electric charge of the particle, the probability of a spontaneous transition with emission of light, i.e. they determine the intensity of the different lines in the emission spectrum of the corresponding system or the degree of their ‘blackness’ in the absorption spectrum [see Part I, § 13]. Such pairs of states $n$, $n'$ for which the matrix elements $z_{nn'}$ vanish do not combine with each other, in the sense that transitions between them connected with the emission or absorption of light, corresponding to oscillations in the $z$-direction, that is to say, ‘polarized’ in this direction, are impossible. The relations between the quantum numbers which characterize the ‘allowed’ transitions (corresponding to the non-vanishing matrix elements) are called ‘selection rules’. The latter, as we have just seen, can be different for different coordinates. For instance, in the case of the $z$-coordinate (i.e. of light polarized in the $z$-direction) they amount to $l' - l = \pm 1$ and $m' = m$, while in the case of the $x$, $y$-coordinates they are $l' - l = \pm 1$ and $m' = m \pm 1$.

This distinction between the different coordinates is a purely formal one in the case of a radially symmetrical field of force, because of the degeneracy connected with such a field. This degeneracy, with respect to the different values of $m$, can be eliminated, as will be shown later, by the presence of a magnetic field parallel to the $z$-axis (Zeeman effect). If the latter is weak enough, the preceding expressions for the matrix elements of $z$ and of $x \pm iy$ will remain approximately valid and will determine the intensity of the spectrum lines linearly polarized in the direction of the magnetic field or circularly polarized about this direction.


14. Matrix Representation in the Case of a Continuous Spectrum

We have limited ourselves hitherto to the matrix representation of physical quantities where the states concerned form a discrete set, corresponding to a discrete spectrum of the energy operator $H$.

The case of a continuous spectrum corresponding to a continuous or ‘mixed’ set of states specified by functions of the type $\psi_C$ or $\psi_{C,n_1,n_2}$, etc. (§ 11), can be dealt with in a similar manner. The matrix elements of any operator $F$ are defined in this case in exactly the same way as in the preceding case, i.e. by integrals of the form

$$F_{C'C''} = \int \psi^*_{C'}\,F\,\psi_{C''}\,dV \tag{101}$$

or

$$F_{C'n_1'n_2';\,C''n_1''n_2''} = \int \psi^*_{C'n_1'n_2'}\,F\,\psi_{C''n_1''n_2''}\,dV \tag{101a}$$

and so on.

These integrals as a rule do not converge, and are similar to the Dirac function $\delta(C' - C'')$ which was introduced and discussed in § 10, and to which the matrix elements of $F$ actually reduce if $F$ represents the energy $H$ or any other constant of the motion commuting with $H$ and satisfying the equation $F\psi_C = F_C\psi_C$. We then get, according to (101),

$$F_{C'C''} = F_{C''}\int \psi^*_{C'}\,\psi_{C''}\,dV,$$

that is,

$$F_{C'C''} = F_{C'}\,\delta(C' - C''). \tag{101b}$$

This expression corresponds to a ‘diagonal matrix’ of the discrete case, just as $\delta(C' - C'')$ corresponds to the unit matrix.

The somewhat indefinite character of the matrix elements $F_{C'C''}$ can be removed in the same way as in the simplest case $F = 1$ when $F_{C'C''}$ reduces to the function $\delta(C' - C'')$: namely, by extending the integration in (101) over a finite volume, and passing to the limit $V \to \infty$ after completing the integration over $C'$ or $C''$ which always occurs in problems of physical interest.† The simplest example of such a problem is the calculation of the probable value of some quantity $F$ for a motion specified by a wave function of the type

$$\psi = \int a_C\,\psi_C\,dC, \tag{102}$$

which can be considered as the superposition of a large number of ‘wave packets’ corresponding to very small intervals of the parameter $C$. Although the integrals $\int |\psi_C|^2\,dV$ diverge, the integral $\int |\psi|^2\,dV$ remains in general finite and can be normalized to 1, just as in the discrete case when $\psi = \sum_n a_n\psi_n$.

We have in fact, reversing the order of integration with respect to $V$ and $C$,

$$\int_V |\psi|^2\,dV = \int_{C'} a^*_{C'}\,dC' \int_{C''} a_{C''}\,dC'' \int_V \psi^*_{C'}\,\psi_{C''}\,dV = \int_{C'} a^*_{C'}\,dC' \int_{C''} a_{C''}\,dC''\;\delta_V(C'' - C').$$

Instead of first performing the integration with regard to $C'$ and $C''$ and then passing to the limit $V \to \infty$, we can in this case replace the (perfectly definite) function $\delta_V(C' - C'')$ at once by the Dirac function $\delta(C' - C'')$, which gives

$$\int |\psi|^2\,dV = \int a^*_{C'}\,a_{C'}\,dC'. \tag{102a}$$
We thus see that the first integral converges along with the integral $\int |a_C|^2\,dC$. The convergence of the latter can, however, always be secured by a reasonable choice of the function $a_C$. The normalization condition thus reduces to the equation

$$\int |a_C|^2\,dC = 1, \tag{102b}$$

which replaces the equation $\sum |a_n|^2 = 1$ of the discrete case, and shows that the product

$$|a_C|^2\,dC \tag{102c}$$

can be considered as the probability that the particle is in a state of motion specified by the interval $(C, C + dC)$.

† In some cases it is preferable to modify the definition of the wave functions $\psi$ so as to make them vanish on a certain surface $S$ beyond which the forces can be assumed to vanish. The problem is thus reduced to one characterized by a discrete spectrum. Such quantities as possess a direct physical interest are usually only slightly affected by the value of the volume $V$ enclosed by $S$, so long as it is sufficiently large. Their exact values can be easily calculated by passing to the limit $V \to \infty$.

The expression (102c) is of the same form as the expression $|\psi|^2\,dV$
for the probability of a position specified by the volume element dV; 
in both cases we have to deal with continuously variable parameters 
(C or the coordinates x, y, z), and therefore in both cases it has a 
meaning to talk of probability with reference not to a definite state or 
position, but to a definite interval of states or positions, the probability 
in question being proportional to the magnitude of the interval. 
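
The normalization condition (102b) and the interval probability (102c) can be checked numerically. The following is only a sketch: the Gaussian form of $a_C$, its width `sigma`, the integration limits and all function names are assumptions made for illustration, since the text leaves $a_C$ arbitrary apart from the convergence of $\int |a_C|^2\,dC$.

```python
import math

def amplitude(c, c0=0.0, sigma=1.0):
    """Hypothetical Gaussian amplitude a_C, scaled so that the integral of |a_C|^2 dC is 1."""
    norm = (math.pi * sigma**2) ** -0.25
    return norm * math.exp(-((c - c0) ** 2) / (2.0 * sigma**2))

def interval_probability(c_low, c_high, steps=20000):
    """Midpoint-rule estimate of the integral of |a_C|^2 dC over [c_low, c_high]."""
    dc = (c_high - c_low) / steps
    return sum(abs(amplitude(c_low + (i + 0.5) * dc)) ** 2 for i in range(steps)) * dc

# Condition (102b): the whole C-axis (here cut off at +-10 widths) carries probability 1.
total = interval_probability(-10.0, 10.0)

# Expression (102c) summed over a finite interval: probability of -1 < C < 1.
one_sigma = interval_probability(-1.0, 1.0)
```

For this particular choice of $a_C$ the second number should agree with erf(1), which gives an independent check on the quadrature.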

Subject to the condition (102b), the probable value of a quantity F 
can be defined by the usual formula 


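$$\bar F = \int \psi^* F \psi\,dV, \tag{103}$$
which can be rewritten in the form
$$\bar F = \int a^*_{C'}\,dC' \int a_{C''}\,dC'' \int_V \psi^*_{C'} F \psi_{C''}\,dV,$$
i.e.
$$\bar F = \iint a^*_{C'} F_{C'C''} a_{C''}\,dC'\,dC''. \tag{103a}$$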


In the simplest case, when F represents a constant of the motion, we 
get, according to (101 b), 
$$\bar F = \int |a_C|^2 F_C\,dC, \tag{103b}$$

in agreement with the above interpretation of the product $|a_C|^2\,dC$.

If, however, F is not a constant of the motion, the integral (103 a) 
representing its probable value cannot be evaluated directly and we 
must have recourse to the method indicated above (first integration 
over finite volume, then over C” or both C” and C’, and finally passage 
to the limit $V \to \infty$).

If the ‘C-space’ is subdivided into infinitely small intervals $\Delta C'$,
$\Delta C''$, etc., and a wave packet is built up for each interval, according
to the formula

$$\psi_{\Delta C'} = \lim_{\Delta C' \to 0} \frac{1}{\sqrt{\Delta C'}} \int_{\Delta C'} \psi_C\,dC, \tag{104}$$

we can replace the matrix components of F with respect to the functions
$\psi_C$ by matrix components with respect to the ‘quasi-discrete’
functions $\psi_{\Delta C}$ (normalized to unity):

$$F_{\Delta C'\Delta C''} = \int \psi^*_{\Delta C'} F \psi_{\Delta C''}\,dV. \tag{104a}$$


The connexion between these matrix components and those discussed 
above is given by the formula 


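$$F_{\Delta C'\Delta C''} = \lim \frac{1}{\sqrt{\Delta C'\,\Delta C''}} \int_{\Delta C'}\int_{\Delta C''} F_{C'C''}\,dC'\,dC'', \tag{104b}$$

whence it follows that the probable value of F can be written in the
form
$$\bar F = \lim \sum_{\Delta C'}\sum_{\Delta C''} \sqrt{\Delta C'\,\Delta C''}\;F_{\Delta C'\Delta C''}\,a^*_{C'} a_{C''}. \tag{104c}$$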


The matrix components (or elements) of a real quantity with respect
to states of a continuous set must, of course, satisfy the Hermitian
relations

$$F_{C''C'} = F^*_{C'C''},$$

just as in the case of a discrete spectrum.

‘Continuous matrices’ cannot be conveniently represented by a square
array of elements or components, such as are used for discrete matrices.
This, however, does not invalidate the analytical results which have
been established in § 11; the only amendment which they require consists
in the replacement of the unit matrix $\delta_{mn}$ by the Dirac function
$\delta(C'-C'')$ and of summation with respect to discretely variable indices
by an integration with respect to the continuously variable indices
wherever the latter occur in the place of the former.

This has already been illustrated by the preceding examples. In a 
similar way we get instead of (75) 


$$F\psi_{C'} = \int F_{C''C'}\,\psi_{C''}\,dC'', \tag{105}$$


and instead of (78) 
$$(FG)_{C'C''} = \int F_{C'C'''}\,G_{C'''C''}\,dC''' \tag{105a}$$


(multiplication law for continuous matrices). 
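
The multiplication law (105 a) can be illustrated numerically by treating the two ‘continuous matrices’ as functions of two variables and replacing the sum over the intermediate index by a quadrature. This is a sketch under stated assumptions: the Gaussian kernels, their widths and the names `gaussian_kernel` and `continuous_product` are hypothetical choices, made because the product of two normalized Gaussian kernels is again a Gaussian kernel whose squared width is the sum of the two squared widths.

```python
import math

def gaussian_kernel(width):
    """Return a hypothetical 'continuous matrix' F(c1, c2): a normalized Gaussian in c1 - c2."""
    def kernel(c1, c2):
        return math.exp(-((c1 - c2) ** 2) / (2.0 * width**2)) / math.sqrt(2.0 * math.pi * width**2)
    return kernel

def continuous_product(f, g, c_min=-12.0, c_max=12.0, steps=4000):
    """(FG)(c1, c2) = integral of F(c1, c) G(c, c2) dc, evaluated by the midpoint rule."""
    dc = (c_max - c_min) / steps
    grid = [c_min + (i + 0.5) * dc for i in range(steps)]
    def product(c1, c2):
        return sum(f(c1, c) * g(c, c2) for c in grid) * dc
    return product

F = gaussian_kernel(0.6)
G = gaussian_kernel(0.8)
FG = continuous_product(F, G)

# Squared widths add: 0.6**2 + 0.8**2 = 1.0, so FG should equal a kernel of width 1.
reference = gaussian_kernel(1.0)
difference = abs(FG(0.3, -0.5) - reference(0.3, -0.5))
```

The factor `dc` plays the part of the element $dC'''$; without it the discretized product would not approach a limit as the grid is refined.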

The seemingly unimportant formal difference between the continuous 
(or mixed) and discrete case is connected, however, with a fundamental 
difference in the physical meaning both of the wave functions and of 
the matrix elements. The essence of this difference consists in the fact 
that, while to states belonging to a discrete set there corresponds in 
classical mechanics periodic or quasi-periodic motion in a limited region 
of space, states belonging to a continuous set correspond to aperiodic
motions of the classical theory, i.e. to types of motion for which the
kinetic energy remains positive at infinity and which approximate there- 
fore at infinite distance (so far as the forces vanish there) to free motion. 

Motions of this type were not considered in the old quantum theory. 
The latter did not encroach upon the holy laws of classical mechanics, 
but merely added to them certain quantum restrictions when the motion 
was confined to a limited region of space and accordingly displayed 
certain periodicities corresponding to the many-valuedness of the action 
function S. As already shown above, Bohr’s quantum conditions 
amounted to the condition of single-valuedness for the function $e^{i2\pi S/h}$.

In the case of aperiodic motions, starting at infinity and ending at 
infinity, the action function S remains single-valued, so that quantum 
restrictions of any kind are unnecessary. 

The coordinates of a particle describing such an aperiodic motion, 
considered as functions of the time t, cannot, of course, be expanded
in a Fourier series. The latter can be replaced, however, in this case 
by a Fourier integral. Limiting ourselves, for the sake of simplicity, 
to motion in one dimension, e.g. parallel to the x-axis, we can write 
instead of (79), § 12,

$$x(t) = \int x(\nu)\,e^{i2\pi\nu t}\,d\nu, \tag{106}$$

and instead of (81 b)

$$x(t) = \int x^0_{\nu'\nu''}\,e^{i2\pi(\nu''-\nu')t}\,d\nu'', \tag{106a}$$


where $x^0_{\nu'\nu''} = x_{\nu'}(\nu''-\nu')$, the product $x^0_{\nu'\nu''}\,d\nu''$ replacing the amplitude
$x_{mn}$; $\nu' = W'/h$ is the frequency associated with the energy $W = W'$,
which is supposed to be the energy of the motion represented by (106 a).
As to the frequency $\nu'' = \nu' + \nu$, it is natural to assume that it coincides
approximately with $W''/h$, where $W''$ denotes the energy of a state, a
transition from which to the state $W'$ corresponds, with regard to
frequency and intensity of the emitted light, to the element
$x^0_{\nu'\nu''}\,e^{i2\pi(\nu''-\nu')t}\,d\nu''$ of the integral (106 a). The question of the
degree of approximation between $\nu''$ and $W''/h$ (if $\nu' = W'/h$) has no
definite meaning in the present case with a continuously variable W, for
equations (80), (80a), (80b), and (80c) cannot be applied to it, the
integrals $\oint$ referring to ‘round trips’ only. We are therefore entitled
to assume that $\nu''$ coincides exactly with $W''/h$, i.e. that there is not
only a ‘correspondence’ but an actual identity between the classical
frequencies occurring in (106) and the quantum frequencies $(W''-W')/h$.
The responsibility for the disagreement between the classical and the
quantum theory can thus be shifted entirely on to the amplitude
coefficients $x^0_{\nu'\nu''}$, which can be supposed to ‘correspond’, i.e. to be
approximately equal, to the matrix elements of x with regard to the
states $W'$ and $W''$

$$\int \psi^{0*}_{\nu'}\,x\,\psi^0_{\nu''}\,dx.$$

The correspondence with these elements can actually be established
with the help of the approximate expressions of the wave functions $\psi^0$
in a way similar to that used in § 12 for the case of a discrete spectrum.
We shall put accordingly 


$$\psi_{\nu} = \frac{C_{\nu}}{\sqrt{v_{\nu}}}\,e^{i2\pi[S^0_{\nu}(x) - W_{\nu}t]/h}, \tag{107}$$


where the coefficients $C_{\nu'}$ must be determined by the condition

$$\int \psi^{0*}_{\nu'}\psi^0_{\nu''}\,dx = \delta(\nu'-\nu''). \tag{107a}$$

Taking into account the relation

$$S^0_{\nu''}(x) - S^0_{\nu'}(x) = (W_{\nu''} - W_{\nu'})\,t, \tag{107b}$$

which can easily be shown to hold approximately (for two states not
far removed from each other) irrespective of the periodic or aperiodic
character of the motion,† we get in the case of neighbouring values of
$\nu'$ and $\nu''$:

$$F_{\nu'\nu''} = \int_{-\infty}^{+\infty} F(x)\,\psi^{0*}_{\nu'}\psi^0_{\nu''}\,dx = C_{\nu'}C_{\nu''}\int_{-\infty}^{+\infty} F(t)\,e^{-i2\pi(W_{\nu''}-W_{\nu'})t/h}\,dt. \tag{108}$$


On the other hand, the Fourier coefficients in the integral representing
a function $F_{\nu'}(t)$:

$$F_{\nu'}(t) = \int F_{\nu'\nu''}\,e^{i2\pi(\nu''-\nu')t}\,d\nu''$$

are determined by the formula

$$F_{\nu'\nu''} = \int_{-\infty}^{+\infty} F_{\nu'}(t)\,e^{-i2\pi(\nu''-\nu')t}\,dt, \tag{108a}$$

which coincides with the preceding expression for $F_{\nu'\nu''}$ if we put
$\nu''-\nu' = (W''-W')/h$ and $C^2_{\nu'} = 1$. The latter condition can easily be
shown to follow from (107a). In fact the main contribution to the
integral (107a) must be due to distant points where the functions $S^0(x)$
reduce to $gx$ with a constant value of g (corresponding to a constant
potential energy). Replacing g by hk, where k is the wave number, we
get, according to (23a), Chap. I,

$$\int_{-\infty}^{+\infty} \psi^{0*}_{\nu'}\psi^0_{\nu''}\,dx = \frac{C_{\nu'}C_{\nu''}}{\sqrt{v_{\nu'}v_{\nu''}}}\int_{-\infty}^{+\infty} e^{i2\pi(k''-k')x}\,dx = \frac{C_{\nu'}C_{\nu''}}{\sqrt{v_{\nu'}v_{\nu''}}}\,\delta(k'-k'') = \delta(\nu'-\nu''),$$

whence

$$\int \frac{C_{\nu'}C_{\nu''}}{\sqrt{v_{\nu'}v_{\nu''}}}\,\delta(k'-k'')\,d\nu'' = \frac{C^2_{\nu'}}{v_{\nu'}}\int \left(\frac{d\nu}{dk}\right)\delta(k''-k')\,dk'' = 1,$$

or, since $\int \delta(k'-k'')\,dk'' = 1$,

$$\frac{C^2_{\nu}}{v_{\nu}}\,\frac{d\nu}{dk} = 1.$$

Taking into account the relation $\nu = hk^2/(2m)$, we get

$$d\nu/dk = hk/m = v_{\nu},$$

(group velocity = corpuscular velocity) and consequently $C^2_{\nu} = 1$.

† Cf. § 12. Since in the present case the integral $J = \oint g\,dx$ is non-existent, we can
put directly
$$S^0_{\nu''}(x) - S^0_{\nu'}(x) = \frac{\partial S^0}{\partial W}\,(W'' - W').$$
We have further, from the definition $\partial S^0/\partial x = g = [2m(W-U)]^{1/2}$,
$$\frac{\partial S^0}{\partial W} = \int \sqrt{\frac{m}{2(W-U)}}\,dx = \int \frac{dx}{v} = t$$
in the same way as before. The relation (107 b) can be proved in a somewhat more
complicated manner for the general case of a (non-periodic) three-dimensional motion.

The integral (108) expressing the Fourier components of a function
F(t) converges and has a definite value only when this function
vanishes for $t = \pm\infty$. This condition is not satisfied for most of the
quantities referring to aperiodic motion. In the simplest case of uniform
motion we have, for instance, $x = vt$ and

$$x^0_{\nu'\nu''} = v_{\nu'}\int_{-\infty}^{+\infty} t\,e^{-i2\pi(\nu''-\nu')t}\,dt,$$

the integral obviously diverging. If, further, F denotes a constant of the
motion (e.g. the energy H), we get

$$H^0_{\nu'\nu''} = W_{\nu'}\int_{-\infty}^{+\infty} e^{-i2\pi(\nu''-\nu')t}\,dt = W_{\nu'}\,\delta(\nu''-\nu'),$$

in exact agreement with the result (101 b) obtained from the matrix
definition of $H^0_{\nu'\nu''}$.
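
The identification of $\int e^{-i2\pi(\nu''-\nu')t}\,dt$ with a Dirac function can be made plausible numerically by cutting the time integral off at $\pm T$. The sketch below is illustrative only: the cut-off `big_t`, the frequency window `nu_max` and the function names are assumptions, and $\nu$ stands for the difference $\nu''-\nu'$. The truncated integral equals $\sin(2\pi\nu T)/(\pi\nu)$; its peak height $2T$ grows with T while its area over $\nu$ stays close to 1.

```python
import math

def truncated_kernel(nu, big_t):
    """Integral of exp(-i 2 pi nu t) dt over -big_t <= t <= big_t (the result is real)."""
    if nu == 0.0:
        return 2.0 * big_t
    return math.sin(2.0 * math.pi * nu * big_t) / (math.pi * nu)

def area_under_kernel(big_t, nu_max=20.0, steps=200000):
    """Midpoint-rule integral of the truncated kernel over -nu_max <= nu <= nu_max."""
    d_nu = 2.0 * nu_max / steps
    return sum(truncated_kernel(-nu_max + (i + 0.5) * d_nu, big_t) for i in range(steps)) * d_nu

peak_short = truncated_kernel(0.0, 1.0)   # height 2T for T = 1
peak_long = truncated_kernel(0.0, 5.0)    # height 2T for T = 5
area_short = area_under_kernel(1.0)
area_long = area_under_kernel(5.0)
```

The same truncation applied to $\int t\,e^{-i2\pi\nu t}\,dt$ gives no such limit, which is the divergence noted above for uniform motion.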

These considerations give a new explanation of the fact, already 
mentioned, that the matrix elements of various quantities in the 
case of a continuous energy spectrum do not in general have definite 
values, being expressed by non-converging integrals over oscillatory 
functions of the $e^{i2\pi\nu t}$ type.


IV 
TRANSFORMATION THEORY 


15. Restricted Transformation Theory; Matrices defined from 
different ‘Points of View’ 


Let us consider two operators H and K which we shall assume to 
represent the energy of the same particle moving in different fields of 
force with the potential-energy functions U(x, y, z) and V(x, y, z), both
being independent of the time and limiting its movement classically to 
a finite region. 

The characteristic values of H, which in this case will form a discrete 
set, will be denoted by H’ or H”, etc. (the dashed letters referring not 
to a particular characteristic value, but to any one of them). The 
corresponding characteristic functions will be denoted by 


$$\psi_{H'} = \psi^0_{H'}(x,y,z)\,e^{-i2\pi H't/h},$$
etc. A similar notation will be used for the characteristic values $K'$
and functions $\phi_{K'} = \phi^0_{K'}(x,y,z)\,e^{-i2\pi K't/h}$ of the operator K.

If there is no degeneracy, the functions $\psi_{H'}$ will be completely
specified by the attached value of the operator to which they belong. In
case of degeneracy we must add to the energy operator one or two
other operators, representing independent constants of the motion, for
example the z-component of the angular momentum $M_z$ and its square
$M^2$ if the potential energy U depends upon the distance r alone (central
field of force). To avoid unnecessary complication, we shall in such
cases understand by H the set of all these three mutually commutable
operators $H_1, H_2, H_3$, and by $H'$ a set of their characteristic values
$H'_1, H'_2, H'_3$ corresponding to the same function $\psi_{H'} = \psi_{H'_1,H'_2,H'_3}$ (in
the sense of the simultaneous validity of all the three equations
$H_1\psi_{H'} = H'_1\psi_{H'}$, $H_2\psi_{H'} = H'_2\psi_{H'}$, $H_3\psi_{H'} = H'_3\psi_{H'}$, which we shall write
as a single equation $H\psi_{H'} = H'\psi_{H'}$). The same remark applies to the
operator K, its characteristic values $K'$, and its characteristic
functions $\phi_{K'}$.

In addition, let us consider some quantity represented by an operator 
F and let us introduce its matrix representation with the help of the 
functions $\psi_{H'}$ on the one hand and of the functions $\phi_{K'}$ on the other.
We shall thus get two different matrices which we shall denote by
$F_H$ and $F_K$ respectively and refer to as the matrix of F ‘from the
point of view’ of H and the matrix of F from the point of view of K.
The components (or elements) of these matrices will be denoted by
$F_{H'H''}$ ($F^0_{H'H''}$) and $F_{K'K''}$ ($F^0_{K'K''}$). We shall thus have

$$F_{H'H''} = \int \psi^*_{H'}F\psi_{H''}\,dV, \qquad F^0_{H'H''} = \int \psi^{0*}_{H'}F\psi^0_{H''}\,dV,$$
$$F_{K'K''} = \int \phi^*_{K'}F\phi_{K''}\,dV, \qquad F^0_{K'K''} = \int \phi^{0*}_{K'}F\phi^0_{K''}\,dV, \tag{109}$$

with

$$F_{H'H''} = F^0_{H'H''}\,e^{i2\pi(H'-H'')t/h}, \qquad F_{K'K''} = F^0_{K'K''}\,e^{i2\pi(K'-K'')t/h}. \tag{109a}$$


In particular we shall have 

$$H_{H'H''} = H'\,\delta_{H'H''}, \qquad K_{K'K''} = K'\,\delta_{K'K''}, \tag{109b}$$
since H and K are diagonal matrices from their own point of view, 
the elements of these matrices being identical with the respective 
characteristic values. 

The transformation theory in its simplest form consists in the estab- 
lishment of a certain connexion between the two ‘points of view’, i.e. of 
certain relations between the functions $\psi_{H'}$ and the functions $\phi_{K'}$, as
well as between the matrices $F_H$ and $F_K$. With the help of equations
(109), the second part of this problem can be reduced to the first.
However, we shall see later that it can be solved independently without
the use of the functions $\psi$ and $\phi$, on the basis of the conditions (109 b).

The fundamental assumption of the transformation theory is that 
the amplitude functions $\phi^0_{K'}(x,y,z)$ can be expressed as linear
combinations of the amplitude functions $\psi^0_{H'}(x,y,z)$ according to the
equation

$$\phi^0_{K'} = \sum_{H'} a_{H'K'}\,\psi^0_{H'} \tag{110}$$

with constant coefficients $a_{H'K'}$. We shall not try to justify this
assumption on formal grounds for the general case of any operators H and K
but shall be content with the following remarks.

(a) The assumption (110) leads to an unambiguous determination
of the expansion coefficients $a_{H'K'}$. Indeed, multiplying (110) by $\psi^{0*}_{H'}$
and supposing the different functions $\psi_{H'}$ to be orthogonal to each
other (which we can always do), we get upon integration

$$a_{H'K'} = \int \psi^{0*}_{H'}\,\phi^0_{K'}\,dV. \tag{110a}$$

It is clear from this that equation (110) can hold only when the sum- 
mation is extended over all the values of H’, i.e. over all the stationary 
states, defined by the operator H (and those representing other in- 
dependent constants of the motion, if there is degeneracy). 

(b) For our assumption to be justified it is necessary and sufficient
that the series (110) with the coefficients determined according to
(110 a) should be convergent.

We shall argue in future as if this convergence condition were
satisfied. It can be shown to be actually satisfied in most cases of 
practical importance corresponding to a small difference between K 
and H due to some weak ‘perturbing’ forces. In this particular case 
the transformation theory we are developing reduces to the so-called 
perturbation theory. 

If the transformation (110) holds, then the reciprocal transformation 


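$$\psi^0_{H'} = \sum_{K'} a^{-1}_{K'H'}\,\phi^0_{K'} \tag{111}$$
must also hold with the coefficients
$$a^{-1}_{K'H'} = \int \phi^{0*}_{K'}\,\psi^0_{H'}\,dV. \tag{111a}$$
Comparing this with (110a), we get the relation
$$a^{-1}_{K'H'} = a^*_{H'K'}. \tag{112}$$

On substituting the expressions (111) in (110) or (110) in (111), we
get, in the first case,

$$\phi^0_{K'} = \sum_{H'} a_{H'K'}\sum_{K''} a^{-1}_{K''H'}\,\phi^0_{K''} = \sum_{K''}\Bigl(\sum_{H'} a^{-1}_{K''H'}\,a_{H'K'}\Bigr)\phi^0_{K''},$$
i.e.
$$\sum_{H'} a^{-1}_{K''H'}\,a_{H'K'} = \delta_{K''K'}, \tag{112a}$$

and in the second case

$$\sum_{K'} a_{H''K'}\,a^{-1}_{K'H'} = \delta_{H''H'}. \tag{112b}$$

Replacing $a^{-1}_{K'H'}$ by $a^*_{H'K'}$ according to (112), we obtain the relations

$$\sum_{H'} a^*_{H'K''}\,a_{H'K'} = \delta_{K''K'}, \tag{113}$$

$$\sum_{K'} a_{H''K'}\,a^*_{H'K'} = \delta_{H''H'}, \tag{113a}$$

which express the orthogonality and normalization of the coefficients
$a_{H'K'}$ (or $a^{-1}_{K'H'}$).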
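
The relations (113) and (113 a) can be checked in a finite-dimensional model. The sketch below is an illustration under loud assumptions: space is replaced by three sample points, so that $\int u^* v\,dV$ becomes a three-term sum, and the two orthonormal sets `psi` and `phi` are produced by Gram-Schmidt from arbitrary seed vectors rather than from any actual operators H and K. All names are hypothetical.

```python
def inner(u, v):
    """Discrete stand-in for the integral of u* v dV."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def gram_schmidt(vectors):
    """Turn linearly independent vectors into an orthonormal set."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = inner(b, w)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = abs(inner(w, w)) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis

# Two hypothetical complete orthonormal sets (stand-ins for the H-set and the K-set).
psi = gram_schmidt([[1, 1j, 0], [0, 1, 1], [1, 0, 2j]])
phi = gram_schmidt([[2, 0, 1], [1j, 1, 0], [0, 1, -1j]])

# a[h][k] plays the part of a_{H'K'}, computed as in (110a).
a = [[inner(psi[h], phi[k]) for k in range(3)] for h in range(3)]

def column_overlap(k1, k2):
    """Left-hand side of (113): sum over H' of a*_{H'K''} a_{H'K'}."""
    return sum(a[h][k1].conjugate() * a[h][k2] for h in range(3))

def row_overlap(h1, h2):
    """Left-hand side of (113a): sum over K' of a_{H''K'} a*_{H'K'}."""
    return sum(a[h1][k] * a[h2][k].conjugate() for k in range(3))

# Squared moduli |a_{H'K'}|^2; every row and every column should sum to 1.
probabilities = [[abs(a[h][k]) ** 2 for k in range(3)] for h in range(3)]
```

Because both sets are complete in this three-dimensional model, the array `a` is unitary, which is exactly what (113) and (113 a) state.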

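Another, equivalent, form of these relations is obtained by multiplying
$\phi^0_{K'}$ in (110) by its conjugate complex and summing over $K'$.
This gives
$$\sum_{K'} \phi^{0*}_{K'}\phi^0_{K'} = \sum_{K'}\sum_{H'}\sum_{H''} a^*_{H'K'}\,a_{H''K'}\,\psi^{0*}_{H'}\psi^0_{H''},$$
i.e. according to (113 a)
$$\sum_{K'} \phi^{0*}_{K'}\phi^0_{K'} = \sum_{H'} \psi^{0*}_{H'}\psi^0_{H'}. \tag{113b}$$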
A’ i’ 


Before proceeding further in the formal development of the theory, 
we shall examine the physical meaning of the assumption implied by 
the transformation equations (110) and (111). 

It should be noticed first of all that the latter have an external
resemblance to the representation of the general solution of the wave
equation $\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ in the form of a sum of its particular
solutions, i.e. to the equation
$$\psi(x,y,z,t) = \sum_{H'} C_{H'}\,\psi_{H'} = \sum_{H'} C_{H'}\,\psi^0_{H'}\,e^{-i2\pi H't/h}. \tag{113c}$$


The fundamental difference between the two cases is that the time t
enters as an essential factor in equation (113 c), while the transformation
equations (110) or (111) do not contain it at all. If, however, we put
in (113 c) $t = 0$ or $t = t_0$, i.e. consider the function $\psi$ at a definite
instant of time, we see that by a suitable choice of the amplitude
coefficients $C_{H'}$ it can be made to coincide with any one of the amplitude
functions $\phi^0_{K'}$, so far as the latter are actually expressible by a series
of the type (110). The physical meaning of the assumption implied in
formula (110) is that any stationary state defined by the operator K,
according to the equation $K\phi_{K'} = K'\phi_{K'}$, can be represented as a
superposition of the alternative states defined by the operator H
(according to $H\psi_{H'} = H'\psi_{H'}$) at a certain instant of time. Such a
coincidence, even if achieved at a definite instant $t = t_0$, will, however,
not persist unless the coefficients $C_{H'}$ are allowed to vary with the time
in an adequate manner. In this case the function $\psi$ defined by (113 c)
will no longer represent a general solution of the equation
$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$; it seems, however, natural
to suppose that, with a suitable definition of the functions $C_{H'}(t)$,
it will represent the general or a particular solution of the equation

$$\left(K + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0.$$


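The latter assumption reduces to the equation

$$\phi_{K'} = \sum_{H'} C_{H'K'}(t)\,\psi_{H'} \tag{113d}$$
or
$$\phi^0_{K'}\,e^{-i2\pi K't/h} = \sum_{H'} C_{H'K'}(t)\,\psi^0_{H'}\,e^{-i2\pi H't/h},$$
which becomes identical with (110) if we put
$$C_{H'K'}(t) = a_{H'K'}\,e^{i2\pi(H'-K')t/h}. \tag{113e}$$
In the same way we can replace the equations of the reciprocal
transformation (111) by
$$\psi_{H'} = \sum_{K'} C_{K'H'}(t)\,\phi_{K'} \tag{114}$$
with
$$C_{K'H'} = a^{-1}_{K'H'}\,e^{i2\pi(K'-H')t/h}. \tag{114a}$$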
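
A short numerical check of the time factors may help here. It is only a sketch: the sample values of the coefficient, of the two energies and of the time are hypothetical, and units with $h = 1$ are assumed. It verifies, for a single term, that the phase of $C_{H'K'}(t)$ converts the time factor of $\psi_{H'}$ into that of $\phi_{K'}$, and that the modulus of the coefficient does not change with the time.

```python
import cmath
import math

H_PLANCK = 1.0  # assumption: units in which Planck's constant h equals 1

def phase(energy, t):
    """Time factor exp(-i 2 pi E t / h) of a stationary state of energy E."""
    return cmath.exp(-2j * math.pi * energy * t / H_PLANCK)

def coefficient(a_hk, energy_h, energy_k, t):
    """C_{H'K'}(t) = a_{H'K'} exp(i 2 pi (H' - K') t / h), the form given in (113e)."""
    return a_hk * cmath.exp(2j * math.pi * (energy_h - energy_k) * t / H_PLANCK)

# Hypothetical sample values for one pair of states.
a_hk, energy_h, energy_k = 0.6 + 0.3j, 1.5, 0.7
t = 2.3

# One term of the time-dependent expansion: C_{H'K'}(t) times the time factor of psi_{H'} ...
left = coefficient(a_hk, energy_h, energy_k, t) * phase(energy_h, t)
# ... equals a_{H'K'} times the time factor of phi_{K'}, as required for (110) to follow.
right = a_hk * phase(energy_k, t)

modulus_now = abs(coefficient(a_hk, energy_h, energy_k, t))
modulus_start = abs(coefficient(a_hk, energy_h, energy_k, 0.0))
```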


We thus see that our fundamental assumption as to the existence of
a linear relation (110) or (111) between the amplitude functions $\phi^0_{K'}$ and
$\psi^0_{H'}$ is equivalent to the assumption that the same motion, whether it
be determined by an energy operator H or K, can be described from
the point of view of the other operator, in the sense that a stationary
state of the set determined by K (or H) can be represented as a
superposition of stationary states determined by H (or K) with variable
amplitude coefficients $C_{H'K'}$ (or $C_{K'H'}$).

If the latter were constant, then (113c) would represent some general
solution of the equation $\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ corresponding to the
possibility of finding the particle in one of the alternative (mutually
excluding) states of motion defined by the different functions $\psi_{H'}$.
The coefficients $C_{H'K'}$, provided they satisfy the normalizing relation
$\sum_{H'} |C_{H'K'}|^2 = 1$, would in this case represent the ‘probability
amplitudes’ of the different alternative states $\psi_{H'}$, the probability of
these states being equal to the square of the moduli of $C_{H'K'}$.

It is natural to preserve this interpretation in the present case when
the $C_{H'K'}$ are functions of the time defined by (113e). This dependence
upon the time does not affect their moduli, which remain constant and
equal to the moduli of the transformation coefficients $a_{H'K'}$, the
normalization condition $\sum_{H'} |C_{H'K'}|^2 = 1$ being satisfied in virtue of the
relations (113) (with $K'' = K'$).

In defining the quantities $|C_{H'K'}|^2$ or $|a_{H'K'}|^2$ as the probabilities of
the different states of the H-set, we must not forget that all these states
are associated with a definite K-state, as indicated by the second
subscript in $a_{H'K'}$. The quantity $|a_{H'K'}|^2$ is not to be regarded as the
probability of the state $H'$ per se irrespective of any accessory
conditions (for such unconditioned probability has no definite value) but
as the probability of the state $H'$ subject to the accessory condition
that the particle is actually in a state of motion specified by the value
$K'$ of K or by the function $\phi_{K'}$.

Instead of talking of the states as described by the wave functions
$\phi_{K'}$ or $\psi_{H'}$, it is often more convenient to speak of the values of certain
quantities F, H, K associated with these states. The fact that a definite
state is actually realized can be expressed by saying that the probability
of this state is equal to unity. We can thus say that $|a_{H'K'}|^2$ is the
probability that the quantity H has the value $H'$ if it is known (with
a probability amounting to certainty, i.e. equal to unity) that the
quantity K has the value $K'$.

It is perfectly natural that the determination of the probability of
a certain value of some quantity, e.g. H, must imply an assumption
about the probability of a given value of some other quantity K, for
the probability theory does not create probabilities, but only correlates
them.

From the relations (112), it follows that $|a_{H'K'}|^2 = |a^{-1}_{K'H'}|^2$. This
equation can be interpreted from the probability point of view as the 
expression of the ‘reciprocity law’, which means that the probability 
of H having the value H’ when K is known to have the value K’ is 
equal to the probability of K having the value K’ when H is known 
to have the value H’. 

This feature of the coefficients $a_{H'K'}$ reveals a close similarity between
them and the amplitude functions $\psi^0_{H'}$ (or $\phi^0_{K'}$). As a matter of fact,
the latter also depend upon two arguments, or sets of arguments: one
of them, x, y, z, specifying the position and the other, $H'$ (or $H'_1, H'_2, H'_3$),
the energy and some quantities commuting with it (i.e. representing
constants of the motion defined by the energy operator H). Further,
the function $|\psi^0_{H'}(x,y,z)|^2$, or more exactly its product with the
volume-element dV, does not determine the probability of a position
specified by dV irrespective of any other circumstances, but subject to the
explicitly stated condition that H is known to have the value $H'$. To
give an adequate formal expression to this analogy between the
coefficients $a_{H'K'}$, $a^{-1}_{K'H'}$ on the one hand, and the functions $\psi^0_{H'}(x,y,z)$,
$\phi^0_{K'}(x,y,z)$ on the other, we shall introduce for the latter the following
notation:
$$\psi^0_{H'}(x',y',z') = \psi^0_{x'H'}, \qquad \phi^0_{K'}(x',y',z') = \phi^0_{x'K'}, \tag{115}$$
using $x'$ to represent a set of values of the three coordinates x, y, z in
the same way as $H'$ or $K'$ is used to represent a set of values of the
three quantities $H_1, H_2, H_3$ or $K_1, K_2, K_3$.

The analogy between the functions $\psi^0_{x'H'}$ and the coefficients $a_{H'K'}$
or $a^{-1}_{K'H'}$ seems to indicate that a set of values of the coordinates x (x, y, z)
can specify a ‘state’ of the particle just as well as a set of characteristic
values of any other three mutually commuting operators $H_1, H_2, H_3$ or
$K_1, K_2, K_3$. We are thus led, in a very natural manner, to revise the
conception of a ‘state’ or ‘stationary state’ which we have been using
hitherto, in the sense that it is not determined by a function $\psi^0_{x'H'}$ or
$\phi^0_{x'K'}$, which refers to two states of two different sets like the
transformation coefficients (or probability amplitudes) $a_{H'K'}$ and $a^{-1}_{K'H'}$, but
simply by the values of three quantities (corresponding to the three
degrees of freedom) which are represented by three independent mutually
commuting operators such as the three spatial coordinates of the particle,
or its energy, z-component of the angular momentum, and square of the
latter (in the case of a motion in a central field of force), and so on.
A ‘state’ defined in this more general way must no longer be necessarily
associated with the idea of motion. As a matter of fact the idea of
motion, in the sense of a change of the position with the time, has no
meaning in wave mechanics, being replaced by the idea of the
probability of finding the particle in a given position when its energy and
two other quantities commuting with the energy have given values.
The functions $\psi^0_{x'H'}$ do not have to be associated with motion any more
than the coefficients $a_{H'K'}$. They are to be interpreted simply as the
probability amplitudes for a state defined by the position $x'$ (or
volume-element $dV'$) subject to the condition that $H = H'$, just as the
coefficients $a^{-1}_{K'H'}$ determine the probability of the value $K'$ of K if H is
known to have the value $H'$.

It should be remarked that in all these considerations the time does 
not play any role whatever so long as it does not appear explicitly in 
H or in the other operators concerned. 

We are thus driven by the inner logic of the ideas embodied in the 
wave-mechanical theory to consider it as a special case of a general 
physical theory—let us call it quantum mechanics—whose problem 
consists in determining the probability of a certain value of some 
quantity or of a set of quantities when a set of some other quan- 
tities is assumed to have given values. This general problem reduces to 
the usual wave-mechanical problem when the first three quantities 
are the coordinates of the particle, and the second three are its energy 
and some other two quantities which are represented by operators 
commuting with the energy operator. 

The condition that the three quantities of each set—those whose 
values are supposed to be known or those for which the probability of 
certain values is being determined—should be represented by mutually 
commuting operators seems to be essential for the problem to have a 
physical meaning. It is customary to express the possibility of fixing 
simultaneously the value of two or more quantities by saying that they 
can be simultaneously observed or measured; this can be regarded as 
the experimental equivalent for the mathematical idea of ‘mutual com- 
mutability’, connected with the operator or the matrix representation 
of the quantity in question. I should like, however, to warn the reader 
against the conclusion, often implied in the above expression, that in 
discussing elementary phenomena, we must keep in mind the observer 
or experimenter as an essential part of these phenomena, supposed to 
be responsible through his interference with them for the
indeterminateness by which they are characterized, and which, as a matter
of fact, is only revealed and not produced by his observations.

This indeterminateness constitutes the characteristic feature of the 
new quantum or wave mechanics, which distinguishes it from classical 
mechanics. In the case of a particle moving in a given field of force 
with three degrees of freedom, the classical mechanics assumed the 
possibility of fixing simultaneously the values of six quantities (for
instance, the three coordinates x, y, z and the three components of the
momentum $g_x, g_y, g_z$, or the energy H, the z-component of the angular
momentum $M_z$, and the square of the latter $M^2$), whereby the motion
was completely determined, while the wave or quantum mechanics is
less ambitious and restricts the number of quantities whose values can 
be fixed (arbitrarily, or by observation) to three, making up for the result- 
ing incompleteness or indeterminateness in the description of the motion 
by probability considerations as to some other set of three quantities. 

Another distinction between classical and quantum mechanics which 
must be borne in mind refers to the role played by the time. In the 
former case this role seems to be much more fundamental and important 
than in the second. As a matter of fact, the time seems to have been 
completely eliminated from the scope of the quantum mechanics as it 
has been specified above. This is, however, not quite true. First of all 
the time enters implicitly in the definition of such quantities as the 
components of velocity (or momentum) and various functions of them 
(such as energy, etc.), although these quantities are represented by 
operators which do not contain the time explicitly. And secondly we 
have supposed from the very beginning of this section that the potential 
energy of the field of force in which the particle is supposed to move 
does not contain the time explicitly, i.e. depends upon the coordinates
alone. It is only subject to this condition that the time can be practically
eliminated from the theory; it becomes, however, a vital element of
the latter when the potential energy is a function not only of the
coordinates but also of the time. In this case Schrödinger’s equation
$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ does not have particular solutions of the form
$\psi = \psi^0_{H'}\,e^{-i2\pi H't/h}$ with $\psi^0_{H'}(x,y,z)$ satisfying the equation $H\psi^0_{H'} = H'\psi^0_{H'}$.
Characteristic values of the energy do not exist, or putting it in another
way, values of the energy, if it is not a constant of the motion, cannot
be measured, and the question of determining the probability of an
arbitrarily chosen position $x'(x',y',z')$ for a given (supposedly known)
value of the energy becomes meaningless.


We shall now come back to our original assumption, that neither H nor
K contains the time explicitly and that they possess a discrete set of
characteristic values $H'(H'_1,H'_2,H'_3)$ and $K'(K'_1,K'_2,K'_3)$ which determine
two discrete sets of ‘states’. We have been led to the conclusion that the
coordinates of the particle can be used for the definition of a third set of
states, specified merely by the position of the particle in space. Since any
values of the coordinates $x'(x',y',z')$ are possible, these values can be
regarded as constituting a ‘continuous spectrum’. This distinction between
H and K on the one hand, and x on the other hand is reflected in the fact
that in determining the probabilities we must speak of definite values of H
and K and of a definite range of the values of x, i.e. of a volume-element
dV in which the particle is supposed to be situated. We thus have the
expressions: $|a_{H'K'}|^2$ for the probability of $H = H'$ if it is known that
$K = K'$, or of $K = K'$ if it is known that $H = H'$; $|\psi^0_{x'H'}|^2\,dV'$ for the
probability that x is enclosed in the range $(x', x'+dx')$ if it is known
that $H = H'$ ($dV' = dx'\,dy'\,dz'$); $|\phi^0_{x'K'}|^2\,dV'$ for the probability that x is
enclosed in the range $(x', x'+dx')$ if it is known that $K = K'$.

Generalizing the reciprocity law which has been established in the
case of $|a_{H'K'}|^2$, we can define $|\psi^0_{x'H'}|^2\,dV'$ and $|\phi^0_{x'K'}|^2\,dV'$ as the
probabilities of $H = H'$ or $K = K'$ when it is known that the particle is
located in the volume-element $dV'$.

The similarity between the functions $\psi^0_{x'H'}$ or $\phi^0_{x'K'}$ and the coefficients
$a^{-1}_{K'H'}$ or $a_{H'K'}$ is revealed also by the fact that they satisfy similar
orthogonality and normalizing relations, which in the former case are
expressed either by means of integrals (over $x'$) instead of sums (over
$H'$ or $K'$) or by functions $\delta(x'-x'')$ instead of $\delta_{H'H''}$ or $\delta_{K'K''}$,
corresponding to the fact that $H'$ and $K'$ form a discrete and $x'$ a continuous
set of values. We have, namely, the relation (113a), which can be
written in the form
$$\sum_{K'} a_{H'K'}\,a^{-1}_{K'H''} = \delta_{H'H''},$$
and to which there correspond the usual orthogonality and normalizing
relations for the ‘wave function’ $\psi$

$$\int \psi^{0*}_{x'H'}\,\psi^0_{x'H''}\,dx' = \delta_{H'H''} \qquad (dx' = dV'). \tag{116}$$
Besides the preceding relation, the coefficients $a_{H'K'}$ also satisfy the
‘reciprocal’ relation (113) or
$$\sum_{H'} a^{-1}_{K'H'}\,a_{H'K''} = \delta_{K'K''},$$
to which an analogue is found in the relation
$$\sum_{H'} \psi^0_{x'H'}\,\psi^{0*}_{x''H'} = \delta(x'-x''), \tag{116a}$$
where $\delta(x'-x'')$ is an abbreviation for the product of the three Dirac
functions $\delta(x'-x'')$, $\delta(y'-y'')$, $\delta(z'-z'')$ (just as $\delta_{K'K''}$ is actually an
abbreviation for the product of the three expressions of this type for
the three quantities implied in K).

The proof of the relations (116a) [i.e. of their equivalence to (116)]
is obtained by multiplying them by $\psi^0_{x''H''}$, where $H''$ is any fixed value
of H, and integrating over $x''$. This gives, in view of (116),

$$\int \sum_{H'} \psi^0_{x'H'}\,\psi^{0*}_{x''H'}\,\psi^0_{x''H''}\,dx'' = \sum_{H'} \psi^0_{x'H'} \int \psi^{0*}_{x''H'}\,\psi^0_{x''H''}\,dx'' = \psi^0_{x'H''},$$

which, according to the definition of the function $\delta(x''-x')$, agrees with
$\int \psi^0_{x''H''}\,\delta(x''-x')\,dx''$. The remaining difference between the probability
amplitudes $a_{H'K'}$, $\psi^0_{x'H'}$, $\phi^0_{x'K'}$ vanishes if we abandon our initial
assumption as to the discreteness of the spectrum of H and K and suppose
that one of these quantities, e.g. H, has a continuous spectrum, being
in this respect equivalent to x (the spectrum of K will be assumed for
a while to remain discrete).

The transformation equations (110) which, with our new notation,
could be written in the form

$$\phi^0_{x'K'} = \sum_{H'} \psi^0_{x'H'}\,a_{H'K'},$$
must now be replaced by†
$$\phi^0_{x'K'} = \int \psi^0_{x'H'}\,a_{H'K'}\,dH'. \tag{117}$$

Multiplying this equation by $\psi^{0*}_{x'H''}$ and integrating over $x'(x',y',z')$
($dx' = dV'$), we get

$$\int \psi^{0*}_{x'H''}\,\phi^0_{x'K'}\,dx' = \int a_{H'K'}\,dH' \int \psi^{0*}_{x'H''}\,\psi^0_{x'H'}\,dx' = \int a_{H'K'}\,\delta(H'-H'')\,dH',$$

that is
$$a_{H''K'} = \int \psi^{0*}_{x'H''}\,\phi^0_{x'K'}\,dx'$$
as before.‡ Since the form of the reciprocal transformation
$$\psi^0_{x'H'} = \sum_{K'} a^{-1}_{K'H'}\,\phi^0_{x'K'} \tag{117a}$$

remains unchanged (so long as K is supposed to have a discrete
spectrum), we get the previous relation between the coefficients $a$ and $a^{-1}$,
namely, $a^{-1}_{K'H'} = a^*_{H'K'}$, leading to the reciprocity law $|a^{-1}_{K'H'}|^2 = |a_{H'K'}|^2$.


† This transition is quite similar to a transition from a Fourier series to a Fourier
integral, which as a matter of fact forms a special case of the transformation or ‘expansion’
(117) and (117 a).

‡ It should be noticed that the former coefficient $a_{H'K'}$ actually corresponds to the
product of the present coefficient with $dH'$, this difference being compensated for by
the difference between the previous and the present form of the orthogonality and
normalizing relation for $\psi^0_{H'}$.


Substituting the preceding expression in (117), we get
$$\varphi^0_{x'K''} = \sum_{K'} \varphi^0_{x'K'} \int a^{-1}_{K'H'}\,a_{H'K''}\,dH',$$
whence it follows that
$$\int a^{-1}_{K'H'}\,a_{H'K''}\,dH' = \delta_{K'K''},$$
or
$$\int a^{*}_{H'K'}\,a_{H'K''}\,dH' = \delta_{K'K''}. \qquad (118)$$

This orthogonality-normalizing relation, which replaces (113), is
identical with the corresponding relation for the function $\varphi^0_{x'K'}$, $x'$ being
replaced by $H'$ [cf. (116)]. In a similar way (through substitution of
(117) in the reciprocal expansion) we find the relation
$$\sum_{K'} a_{H'K'}\,a^{*}_{H''K'} = \delta(H'-H''), \qquad (118\,\mathrm{a})$$
which is the complete analogue of (116a) with $x'$ replaced by $H'$ and
$H'$ by $K'$.

If both H and K have a continuous spectrum, the relations (118) and
(118a), as well as (116) and (116 a), are replaced by relations of the form
$$\int a^{*}_{H'K'}\,a_{H'K''}\,dH' = \delta(K'-K''),$$
$$\int a_{H'K'}\,a^{*}_{H''K'}\,dK' = \delta(H'-H''),$$
$$\int \psi^{0*}_{x'H'}\,\psi^{0}_{x'H''}\,dx' = \delta(H'-H''),$$
$$\int \psi^{0}_{x'H'}\,\psi^{0*}_{x''H'}\,dH' = \delta(x'-x''),$$
etc., all the sums being replaced by integrals and all the $\delta_{K'K''}$-numbers
by $\delta(K'-K'')$-functions. All the transformation or expansion formulae
acquire in this case the same form (117 a).

From the complete analogy between $a_{H'K'}$ and $\psi^0_{x'H'}$ or $\varphi^0_{x'K'}$ it follows
in particular that we must have, in addition to the equations
$$\varphi^0_{x'K'} = \int \psi^0_{x'H'}\,a_{H'K'}\,dH', \qquad \psi^0_{x'H'} = \int \varphi^0_{x'K'}\,a^{-1}_{K'H'}\,dK', \qquad (119)$$
the equation
$$a_{H'K'} = \int \psi^{0\,(-1)}_{H'x'}\,\varphi^0_{x'K'}\,dx', \qquad (119\,\mathrm{a})$$
where $\psi^{0\,(-1)}_{H'x'} = \psi^{0*}_{x'H'}$. In fact, this equation is nothing else but the
expression (110a) for the coefficients $a_{H'K'}$. We can thus consider this
equation as a ‘transformation’ between the functions $a_{H'K'}$ and $\varphi^0_{x'K'}$,
$\psi^{0\,(-1)}_{H'x'}$ playing the role of the transformation coefficients, or as a
transformation between the functions $a_{H'K'}$ and $\psi^{0\,(-1)}_{H'x'}$, the role of the
transformation coefficients being played in this case by $\varphi^0_{x'K'}$.

It should be mentioned that (119a) still holds when H and K have
discrete spectra, equation (119) being replaced by (117) and its reciprocal
$$\psi^0_{x'H'} = \sum_{K'} \varphi^0_{x'K'}\,a^{-1}_{K'H'}. \qquad (119\,\mathrm{b})$$


After we have thus settled the physical meaning of the ‘transforma- 
tion coefficients’ or ‘wave functions’ as the probability amplitudes for 
the values of one of the quantities concerned when the value of the 
other is supposed to be fixed, we obtain an extremely simple and 
illuminating interpretation of the various ‘transformation equations’ 
connecting these probability amplitudes. All these equations can be
considered, namely, as the expression or rather the direct consequence of
the addition and multiplication law of the new probability theory (which
deals with the probability amplitudes in the same way as the old theory
dealt with the probabilities themselves).

Taking the last equation, for example, we see that the product
$\varphi^0_{x'K'}\,a^{-1}_{K'H'}$ can be interpreted as the probability amplitude that x will
be equal to $x'$ if $K = K'$ and that at the same time K will have the
value $K'$ if H is known to be equal to $H'$. Keeping the latter value
as well as that of x fixed, and summing the products $\varphi^0_{x'K'}\,a^{-1}_{K'H'}$ for
all possible values of K, we must obviously obtain the probability
amplitude of $x = x'$ subject to the assumption that $H = H'$, in agreement
with (119b).
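
The multiplication and addition law expressed by (119b) can be sketched numerically in a two-state toy model. This is only an illustration under stated assumptions: the arrays `phi` (standing for $\varphi^0_{x'K'}$) and `a` (standing for $a_{H'K'}$) are hypothetical unitary matrices chosen for the example, not quantities taken from the text.

```python
import math

s = 1 / math.sqrt(2)

# Hypothetical amplitudes: phi[x][k] ~ phi^0_{x'K'}, a[h][k] ~ a_{H'K'} (both unitary).
phi = [[math.cos(0.7), -math.sin(0.7)],
       [math.sin(0.7),  math.cos(0.7)]]
a = [[s, 1j * s],
     [1j * s, s]]

# Reciprocal coefficients a^{-1}_{K'H'} = a*_{H'K'}.
a_inv = [[a[h][k].conjugate() for h in range(2)] for k in range(2)]

# (119b): multiply the two amplitudes and add over the intermediate values K'.
psi = [[sum(phi[x][k] * a_inv[k][h] for k in range(2)) for h in range(2)]
       for x in range(2)]

# Total probability of finding some x' when H = H' is fixed.
norms = [sum(abs(psi[x][h]) ** 2 for x in range(2)) for h in range(2)]
```

The composed amplitudes `psi` remain normalized for every fixed $H'$, as they must if they are to be read as probability amplitudes.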


16. Transformation of Matrices 

We shall now return to the beginning of the preceding section, i.e. we 
shall again assume the values of H and K to be discrete, and we shall 
examine the transformation equations for the matrices representing 
different quantities F from the point of view of H and K. Before doing 
this we must point out the fact that the transformation coefficients
$a_{H'K'}$ and $a^{-1}_{K'H'}$ can also be considered as the matrix elements of a certain
matrix $a$ and its reciprocal $a^{-1}$ respectively, in the same way as
$F_{H'H''}$ or $F_{K'K''}$ are the matrix elements of $F_H$ or $F_K$. The main
difference between them is that, in the latter case, the two indices
($H', H''$ or $K', K''$) refer to states of the same set, defined either by H or
by K, whereas in the former case the first index refers to a state of the
one set and the second to a state of the other set.

Another difference (closely related to the preceding one) is that while
the matrix elements $F_{H'H''}$ or $F_{K'K''}$ are Hermitian, i.e. satisfy the
conditions $F_{H'H''} = F^*_{H''H'}$, $F_{K'K''} = F^*_{K''K'}$, the coefficients (or matrix
elements) $a_{H'K'}$ are not Hermitian, as shown by the relations (112).

The matrix which is obtained from $F$ (or $a$) by interchanging the
rows and the columns is called the transposed matrix of $F$ and is
denoted (usually) by $\tilde F$. A matrix $\tilde F^*$ which is obtained from the
transposed $\tilde F$ by taking the conjugate complex of its elements is called,
according to Jordan, the ‘adjoint’ matrix of $F$ (‘conjugate imaginary’
according to Dirac) and denoted by $F^\dagger$. Using this notation, we can
write the Hermitian condition in the form
$$F^\dagger = F, \qquad (120)$$
while the condition (112) can be written in the form
$$a^\dagger = a^{-1}. \qquad (120\,\mathrm{a})$$


Matrices a satisfying this condition are called ‘unitary’, because the 
product of such a matrix with its adjoint matrix, which is the analogue 
of the square of the modulus of an ordinary complex number, is equal 
to unity (i.e. to the unit matrix). 

It is self-evident that the multiplication of the matrices of the 
type a which do not correspond to a definite ‘point of view’ (H or K) 
but serve to connect two different points of view must be performed 
according to the usual rule of matrix multiplication, i.e. by com- 
bining the rows of the first factor with the columns of the second. This 
means that the elements of the product of two matrices $a$ and $b$ must
have the form
$$(ab)_{mn} = \sum_{k} a_{mk}\,b_{kn},$$


i.e. that the second index of the elements of the first factor should 
coincide with the first index of the elements of the second factor, this 
common index being the index of summation. 

From the point of view of this definition, the product of a ‘mixed’
matrix such as $a$ by itself or its conjugate complex $a^*$ would have
no meaning, since the two indices refer to states of different sets, and
therefore cannot be identified. We can, however, form the product of
$a$ with its transposed ($\tilde a$) or adjoint matrix ($a^\dagger$), since the first index
of the latter two refers to a state of the same set as the second index of
the former and vice versa. The expression $\sum_{K'} a_{H'K'}\,\tilde a_{K'H''}$ can thus be
considered as the $(H', H'')$ element of the product matrix $a\tilde a$ which is
of the same ‘pure’ type as the matrix $F_H$. The same refers to the
matrix $aa^\dagger$ or $aa^{-1}$, if the elements of the reciprocal matrix $a^{-1}$ are
labelled with the indices $H'$ and $K'$ in the order opposite to that which
refers to the matrix $a$ (as has actually been done in the preceding
section). It can easily be shown that the matrix $aa^\dagger$ is Hermitian (while
$a\tilde a$ is not). In fact, taking its adjoint matrix, which is obviously equal
to the product of the adjoint matrices of the two factors taken in the
reverse order, we get
$$(aa^\dagger)^\dagger = a^{\dagger\dagger}a^\dagger = aa^\dagger,$$
in agreement with (120).
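
A small numerical check of this statement may be helpful. The sketch below uses an arbitrarily chosen complex matrix `a` (an assumption of the example, not a matrix from the text) and compares $aa^\dagger$ with $a\tilde a$.

```python
def matmul(p, q):
    """Ordinary rule: rows of the first factor combined with columns of the second."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(m):
    return [[m[i][j] for i in range(len(m))] for j in range(len(m[0]))]

def dagger(m):
    """Adjoint matrix: transpose followed by complex conjugation."""
    return [[m[i][j].conjugate() for i in range(len(m))] for j in range(len(m[0]))]

def is_hermitian(m, tol=1e-12):
    return all(abs(m[i][j] - m[j][i].conjugate()) < tol
               for i in range(len(m)) for j in range(len(m)))

# Arbitrary (non-unitary) complex matrix used only for illustration.
a = [[1 + 2j, 3], [0.5j, 2 - 1j]]

a_adjoint_product = matmul(a, dagger(a))       # a a^dagger
a_transpose_product = matmul(a, transpose(a))  # a a-tilde
hermitian_flags = (is_hermitian(a_adjoint_product), is_hermitian(a_transpose_product))
```

For this choice the first product passes the Hermitian test and the second does not, because $a\tilde a$ is symmetric but has complex diagonal elements.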

It should be noticed that the two matrices $aa^\dagger$ and $a^\dagger a$ are, in general,
entirely different, the former belonging to the same type as $F_H$, and
the latter belonging to the same type as $F_K$.

In the particular case of a unitary matrix, satisfying the conditions
(120a), we get
$$(a^\dagger a)_{K'K''} = \sum_{H'} a^\dagger_{K'H'}\,a_{H'K''} = \sum_{H'} a^*_{H'K'}\,a_{H'K''} = \delta_{K'K''},$$
$$(aa^\dagger)_{H'H''} = \sum_{K'} a_{H'K'}\,a^\dagger_{K'H''} = \sum_{K'} a_{H'K'}\,a^*_{H''K'} = \delta_{H'H''},$$
according to (112a)-(113a), or in matrix notation
$$a^\dagger a = \delta_K, \qquad aa^\dagger = \delta_H,$$
where $\delta_H$ and $\delta_K$ denote the ‘unit matrix’ as defined from the ‘point
of view’ of H or K. Neglecting the physical meaning implied in this
difference one often identifies the two unit matrices and writes
$$aa^\dagger = a^\dagger a = 1,$$
which occasionally can lead to misunderstandings.

The possibility of treating the transformation coefficients as the
elements of a (mixed) matrix and of applying to the latter the usual
rule of matrix multiplication is substantiated by the results obtained
in two or more successive transformations. Let L be an operator (or
set of three operators $L_1$, $L_2$, $L_3$) of the same kind as H or K, with
the (discrete) characteristic values $L'$ and characteristic functions $\chi^0_{L'}$.
These functions can be ‘transformed’ to those of K by means of the
equations $\chi^0_{L'} = \sum_{K'} b_{K'L'}\,\varphi^0_{K'}$, and further to those of H by means of
the equations $\varphi^0_{K'} = \sum_{H'} a_{H'K'}\,\psi^0_{H'}$. Combining them together, we obtain
a direct transformation from L to H,
$$\chi^0_{L'} = \sum_{H'} c_{H'L'}\,\psi^0_{H'},$$
with the coefficients $c_{H'L'} = \sum_{K'} a_{H'K'}\,b_{K'L'}$. The matrix of these coefficients
is thus equal to the product of the matrices $a$ and $b$ taken in the
order stated, and calculated according to the ordinary rule. Using the
matrix representation for the transformation coefficients, we can thus
define the matrix of two successive transformations as the product of
the matrices of each of the separate transformations. This holds, in
particular, for the case which has been considered above, where the
second transformation is the reciprocal of the first one.
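
The composition rule $c = ab$ can be sketched in a two-state toy model. The matrices `psi`, `a`, and `b` below are hypothetical examples (columns of `psi` stand for the functions $\psi^0_{H'}$ sampled at two points); only the multiplication rule itself comes from the text.

```python
import math

def matmul(p, q):
    """(pq)_{mn} = sum over k of p_{mk} q_{kn}."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def rotation(angle):
    return [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle),  math.cos(angle)]]

s = 1 / math.sqrt(2)
psi = rotation(0.2)        # psi[x][h]: functions of H sampled at two points x
a = rotation(0.5)          # a[h][k]:   transformation H -> K
b = [[s, s], [s, -s]]      # b[k][l]:   transformation K -> L

phi = matmul(psi, a)             # functions of K
chi_two_steps = matmul(phi, b)   # functions of L, reached through K
c = matmul(a, b)                 # c_{H'L'} = sum over K' of a_{H'K'} b_{K'L'}
chi_direct = matmul(psi, c)      # functions of L, reached directly from H
```

The two routes give the same functions, which is just the associativity of the ordinary matrix product.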

We can now turn to the main object of this section, the transformation
of the matrix representing the same quantity F in the transition
from one ‘point of view’ specified by H to another, specified by K.
Substituting (110) in the expression (109) for the elements of $F_K$ we get
$$F^0_{K'K''} = \sum_{H'}\sum_{H''} a^*_{H'K'}\,a_{H''K''}\,F^0_{H'H''}, \qquad (121)$$
which can be written in the form
$$F^0_{K'K''} = \sum_{H'}\sum_{H''} a^\dagger_{K'H'}\,F^0_{H'H''}\,a_{H''K''}.$$

This expression can be interpreted, according to the matrix multiplication
law, as the $(K', K'')$-element of the product of the matrices
$a^\dagger$, $F_H$, and $a$ taken in the order stated. We can thus put
$$F_K = a^\dagger F_H\,a. \qquad (121\,\mathrm{a})$$
Substituting (111) in (109), we get in the same way
$$F_H = a\,F_K\,a^\dagger.$$
This equation can be obtained from the preceding equation if the latter
is multiplied by $a$ on the left and by $a^\dagger$ on the right side and if the
relations $a^\dagger a = aa^\dagger = 1$ are taken into account.
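
As a minimal sketch of (121a) and of its inverse, the following lines transform a Hermitian matrix with a unitary matrix and transform it back. Both `a` and `f_h` are hypothetical two-by-two examples chosen for this illustration.

```python
import math

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def dagger(m):
    return [[m[i][j].conjugate() for i in range(len(m))] for j in range(len(m[0]))]

s = 1 / math.sqrt(2)
a = [[s, 1j * s], [1j * s, s]]          # unitary: a^dagger a = a a^dagger = 1
f_h = [[1, 2 - 1j], [2 + 1j, -3]]       # Hermitian matrix "from the point of view of H"

f_k = matmul(matmul(dagger(a), f_h), a)        # (121a): F_K = a^dagger F_H a
f_h_back = matmul(matmul(a, f_k), dagger(a))   # inverse relation: F_H = a F_K a^dagger
```

The transformed matrix `f_k` is again Hermitian, and `f_h_back` reproduces `f_h` up to rounding.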

If we restrict ourselves to multiplying (121a) by $a$ on the left or by
$a^\dagger$ on the right, we get
$$F_H\,a = a\,F_K$$
and
$$F_K\,a^\dagger = a^\dagger F_H.$$
The product matrices in these equations have all a mixed character,
with elements of the type $(H', K')$ in the case of the first and $(K', H')$
in that of the second.

Written in matrix elements, these equations run
$$(F_H a)_{H'K'} = \sum_{H''} F_{H'H''}\,a_{H''K'} = \sum_{K''} a_{H'K''}\,F_{K''K'} = (aF_K)_{H'K'},$$
$$(F_K a^\dagger)_{K'H'} = \sum_{K''} F_{K'K''}\,a^\dagger_{K''H'} = \sum_{H''} a^\dagger_{K'H''}\,F_{H''H'} = (a^\dagger F_H)_{K'H'}. \qquad (121\,\mathrm{c})$$

If in (121) we put, in particular, $F = K$ or $F = H$, we get
$$K_H\,a = a\,K_K, \qquad a\,H_K = H_H\,a, \qquad (122)$$
and two similar equations with $a^\dagger$ instead of $a$.
Taking the element $(H', K')$ of the first equation (122), we get, since
$K_{K'K''} = K'\delta_{K'K''}$,
$$\sum_{H''} K_{H'H''}\,a_{H''K'} = K' a_{H'K'}. \qquad (122\,\mathrm{a})$$
In the same way we obtain from the second equation (122)
$$\sum_{K''} a_{H'K''}\,H_{K''K'} = H' a_{H'K'}. \qquad (122\,\mathrm{b})$$


The equations (122a) have exactly the same form for all values of
$K'$. Dropping $K'$ as second index in the coefficients $a$, we can rewrite
them as a single system of linear homogeneous equations (corresponding
to different values of $H'$) for a set of variables $a_{H'}$,
$$\sum_{H''} K_{H'H''}\,a_{H''} = K' a_{H'}, \qquad (123)$$
with a parameter $K'$.

This system of equations can serve for the direct determination both of
the transformation coefficients $a_{H'K'}$ and of the values $K'$ if the matrix
elements of $K_H$ are known. We have, indeed, as the condition of the
compatibility of equations (123) the vanishing of the determinant,
$$\begin{vmatrix} K_{H'H'}-K' & K_{H'H''} & K_{H'H'''} & \cdots \\ K_{H''H'} & K_{H''H''}-K' & K_{H''H'''} & \cdots \\ K_{H'''H'} & K_{H'''H''} & K_{H'''H'''}-K' & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{vmatrix} = 0, \qquad (123\,\mathrm{a})$$
which is an equation for the determination of the possible values of
$K'$ ($K''$, $K'''$, etc.). To each of these values there corresponds a set of
values of the variables $a_{H'}$, which we can identify, under certain conditions,
with the transformation coefficients $a_{H'K'}$ ($a_{H'K''}$, $a_{H'K'''}$, etc.).
These conditions amount to the relations $a^\dagger a = aa^\dagger = 1$, which can be
shown to be verified if the solutions of (123) are normalized according
to the equation
$$\sum_{H'} a_{H'}\,a^*_{H'} = 1 \qquad (123\,\mathrm{b})$$
for every value of $K'$.
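
A worked two-state example may make the procedure (123)-(123b) concrete. It is an illustration only: the matrix below, with the two states labelled 1 and 2, is a hypothetical choice and does not come from the text.

$$K_H = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \text{so that (123) reads} \qquad a_2 = K' a_1, \quad a_1 = K' a_2 .$$

The compatibility condition (123a) becomes

$$\begin{vmatrix} -K' & 1 \\ 1 & -K' \end{vmatrix} = K'^2 - 1 = 0, \qquad K' = +1 \ \text{or} \ -1 .$$

For $K' = +1$ the equations give $a_1 = a_2$, for $K' = -1$ they give $a_1 = -a_2$, and the normalization (123b) fixes the common modulus at $1/\sqrt{2}$. Collecting the two solutions as columns,

$$a = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad a^\dagger a = aa^\dagger = 1, \qquad a^\dagger K_H\,a = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

so that the two columns are mutually orthogonal and the transformed matrix is diagonal with the characteristic values on the diagonal.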
Let us first of all make sure of the fact that the values $K'$ obtained
from (123a) are real. To show this we take the equations
$$\sum_{H''} K_{H'H''}\,a_{H''K'} = K' a_{H'K'},$$
$$\sum_{H''} K^*_{H'H''}\,a^*_{H''K'} = K'^* a^*_{H'K'}$$
(the first of which can be considered as an identity, resulting from (123)
for a particular value of $K'$, and the second as its conjugate complex),
multiply them respectively by $a^*_{H'K'}$ and $a_{H'K'}$, sum over $H'$, and finally
subtract one from the other. This gives
$$(K'-K'^*) \sum_{H'} a_{H'K'}\,a^*_{H'K'} = \sum_{H'}\sum_{H''} K_{H'H''}\,a_{H''K'}\,a^*_{H'K'} - \sum_{H'}\sum_{H''} K^*_{H'H''}\,a^*_{H''K'}\,a_{H'K'}.$$
Taking into account the Hermitian condition $K^*_{H'H''} = K_{H''H'}$, we
can rewrite the second double sum on the right side in the form
$\sum_{H'}\sum_{H''} K_{H''H'}\,a_{H'K'}\,a^*_{H''K'}$, which becomes identical with the first double
sum if we interchange the summation indices $H'$ and $H''$. We thus get
$$(K'-K'^*) \sum_{H'} a_{H'K'}\,a^*_{H'K'} = 0,$$
or, since the sum $\sum_{H'} a_{H'K'}\,a^*_{H'K'} = \sum_{H'} |a_{H'K'}|^2$ is essentially positive,
$$K' - K'^* = 0.$$
This equation expresses the fact that $K'$ is real.
If, in the preceding argument, we replace the second equation by an
equation (identity)
$$\sum_{H''} K^*_{H'H''}\,a^*_{H''K''} = K'' a^*_{H'K''},$$
corresponding to some value of $K''$ different from $K'$, multiply it by
$a_{H'K'}$, sum over $H'$, and subtract from the first equation multiplied
by $a^*_{H'K''}$ and also summed over $H'$, we get
$$(K'-K'') \sum_{H'} a_{H'K'}\,a^*_{H'K''} = \sum_{H'}\sum_{H''} K_{H'H''}\,a_{H''K'}\,a^*_{H'K''} - \sum_{H'}\sum_{H''} K^*_{H'H''}\,a^*_{H''K''}\,a_{H'K'}.$$
In view of $K^*_{H'H''} = K_{H''H'}$ and the interchangeability of the summation
indices $H'$, $H''$, the right side vanishes just as in the case $K' = K''$,
so that
$$(K'-K'') \sum_{H'} a_{H'K'}\,a^*_{H'K''} = 0,$$
which, since $K'-K''$ is assumed to be different from zero, reduces to
$$\sum_{H'} a_{H'K'}\,a^*_{H'K''} = 0$$
or
$$\sum_{H'} a^\dagger_{K''H'}\,a_{H'K'} = 0 \qquad (K'' \neq K').$$
This relation expresses the mutual ‘orthogonality’ of the different sets
of solutions of the system of equations (123). Together with the
normalizing condition (123 b), it can be written in the form
$$a^\dagger a = \delta_K,$$
whereby the identity of the coefficients $a_{H'K'}$, obtained from equations
(123), (123a), and (123 b), with those defined at the outset with the help
of the wave functions $\psi^0_{H'}$ and $\varphi^0_{K'}$ by means of equations (110a) and
(111a), is demonstrated.
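
The reality of the roots and the orthogonality of the solutions can be checked on a complex Hermitian two-by-two example. The sketch below solves (123)-(123b) in closed form; the function name `secular_2x2` and the matrix `k_h` are hypothetical, and the formula assumes a non-vanishing off-diagonal element.

```python
import math

def secular_2x2(k):
    """Solve (123)-(123b) for a 2x2 Hermitian matrix with k[0][1] != 0."""
    k11, k12, k22 = k[0][0].real, k[0][1], k[1][1].real
    mean = (k11 + k22) / 2
    root = math.sqrt(((k11 - k22) / 2) ** 2 + abs(k12) ** 2)
    values = [mean - root, mean + root]           # the two roots of (123a), both real
    columns = []
    for lam in values:
        v = [k12, lam - k11]                      # satisfies the first row of (123)
        norm = math.sqrt(sum(abs(c) ** 2 for c in v))
        columns.append([c / norm for c in v])     # normalization (123b)
    a = [[columns[col][row] for col in range(2)] for row in range(2)]  # a[H][K]
    return values, a

k_h = [[2, 1 - 1j], [1 + 1j, 3]]                  # hypothetical Hermitian matrix K_H
values, a = secular_2x2(k_h)

# a^dagger a, which should be the unit matrix in the indices K', K''.
overlap = [[sum(a[h][p].conjugate() * a[h][q] for h in range(2))
            for q in range(2)] for p in range(2)]
```

For this matrix the two characteristic values are 1 and 4, and `overlap` is the unit matrix up to rounding, in line with the relation $a^\dagger a = \delta_K$.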

At the same time we have demonstrated the possibility of effecting
the transformation of the matrix $F_H$, representing an arbitrary physical
quantity F ‘from the point of view of H’ (i.e. with regard to states
defined by H) to the matrix $F_K$ representing the same quantity
‘from the point of view of K’ without the use of the wave functions
characteristic of H and K, but by a purely matrix method, based upon
the matrix representation of all quantities (including the key one K)
‘from the point of view of H’. The transition from this point of view
to that of K can be effected by means of the equations (123), (123a),
(123 b), which determine the transformation matrix $a$, and further by
means of equation (121a), giving the new matrix elements of any
quantity F in terms of the old matrix elements.

In view of the relation $a^\dagger = a^{-1}$, this formula can also be written in
the form
$$F_K = a^{-1}F_H\,a. \qquad (124)$$
The transformation matrix $a$ can actually be defined by the condition
$$a^{-1}K_H\,a = K_K \quad \text{(a diagonal matrix)} \qquad (124\,\mathrm{a})$$
which leads, after a left-handed multiplication by $a$, to the equation
$K_H a = aK_K$, i.e. to the system of equations (123); the unitary character
of the matrix $a$, expressed by the relation $a^\dagger a = 1$, can be considered
as a consequence of these equations.

A transformation of the type (124) is generally called a canonical matrix
transformation. It has an interesting feature which does not depend
upon $a$ being a unitary matrix (i.e. satisfying the relation $a^\dagger = a^{-1}$),
namely, of leaving invariant all the functional relations between the
original matrices, the same functional relations holding between the
transformed matrices. This can be proved directly by putting in (124)
$F = E+G$ or $F = EG$. In the first case we get, since $F_H = E_H+G_H$,
$$F_K = a^{-1}(E_H+G_H)a = a^{-1}E_H a + a^{-1}G_H a = E_K+G_K;$$
in the second case we have, using $(EG)_H = E_H G_H$,
$$F_K = a^{-1}E_H G_H\,a.$$
Now we can insert between $E_H$ and $G_H$ the product $aa^{-1}$, since it is
equal to the unit matrix $\delta_H$ whose product with any other matrix is
identical with the latter (just as in the case of the multiplication of
ordinary numbers by an ordinary unity). We thus get, by the associative
law,
$$F_K = (a^{-1}E_H a)(a^{-1}G_H a) = E_K G_K.$$

This proof can easily be extended by induction to any function F of
E and G, so that, putting (in the operator representation) $F = f(E, G)$,
we get
$$f(E_K, G_K) = a^{-1} f(E_H, G_H)\,a \qquad (124\,\mathrm{b})$$
or
$$a^{-1} f(E_H, G_H)\,a = f(a^{-1}E_H a,\ a^{-1}G_H a).$$
It follows from these equations that, in particular, the transformation
(124) does not affect the validity of the commutation relations between
the coordinates and the components of the momentum; the original relations
$(p_x x - x p_x)_H = \frac{h}{2\pi i}\,\delta_H$ are transformed into $(p_x x - x p_x)_K = \frac{h}{2\pi i}\,\delta_K$.
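
That the invariance of sums and products does not require $a$ to be unitary can be seen in a short exact-arithmetic sketch. The matrices `a`, `e`, and `g` below are arbitrary hypothetical examples with rational entries, and `to_k` is a helper name introduced only for this illustration of (124).

```python
from fractions import Fraction

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[ m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det,  m[0][0] / det]]

def to_k(m, a):
    """Transformation (124): a^{-1} m a, with no assumption that a is unitary."""
    return matmul(matmul(inverse_2x2(a), m), a)

a = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]   # invertible, not unitary
e = [[Fraction(1), Fraction(2)], [Fraction(2), Fraction(-1)]]
g = [[Fraction(0), Fraction(1)], [Fraction(1), Fraction(4)]]

e_k, g_k = to_k(e, a), to_k(g, a)
total = [[e[i][j] + g[i][j] for j in range(2)] for i in range(2)]

product_ok = to_k(matmul(e, g), a) == matmul(e_k, g_k)
sum_ok = to_k(total, a) == [[e_k[i][j] + g_k[i][j] for j in range(2)] for i in range(2)]
```

Because the arithmetic is exact, both comparisons hold as strict equalities rather than within a tolerance.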

Canonical transformations of the above type should be distinguished
from canonical transformations of the variables $x, y, z, p_x, p_y, p_z$ in the
sense corresponding to the general definition of a canonical transformation
in classical mechanics (see § 5). In the former case the canonically
conjugate variables are supposed to remain unaltered, the transformation
referring to the matrices only by which they are represented from
the point of view of different energy operators (H or K). In the latter
case, on the contrary, the variables $x, \ldots, p_z$ are themselves transformed
into a new set of canonically conjugate variables $\xi, \eta, \zeta, \pi_\xi, \pi_\eta, \pi_\zeta$, the
energy operator $H_{(x)} = H(x, \ldots, p_z)$ remaining essentially the same and only
changing its external form because the old variables defining it are
replaced by their expressions in terms of the new variables. We thus
get for it a new function, $H_{(\xi)}$ say, of the variables $\xi, \ldots, \pi_\zeta$, which is,
however, numerically equal to $H_{(x)}$ for the corresponding values of the
original variables. This numerical equality of the classical theory is
replaced in quantum mechanics by the equality of the characteristic
values of the operators $H_{(x)}$ and $H_{(\xi)}$. The condition expressing the
canonical character of the transformation from the original variables
to the new ones consists in the fact that the matrices representing the
latter (from any point of view) should satisfy the same commutation
relations $\pi_\xi \xi - \xi \pi_\xi = h/2\pi i$, etc., as those representing the old variables.
This means that the new matrices (of $\xi, \ldots, \pi_\zeta$) can be derived
from the old ones (of $x, \ldots, p_z$) by a canonical transformation in the first
sense, i.e. in the sense of the equation (124). The physical meaning of
such a transformation will, however, be entirely different from the case
to which (124) refers, the two kinds of transformation bearing but a
formal resemblance to each other. We shall come back to the transformations
of the second kind in the next section.

In the case of a degeneracy of the original energy matrix $H_H$, i.e.
when some of its diagonal elements coincide, it is necessary to consider
it simultaneously with one or two other matrices, which represent independent
constants of the motion specified by H. We must therefore
replace the operator H by the three operators $H_1$, $H_2$, $H_3$ and define the
matrix representation of any quantity F from the ‘point of view’ of this
‘trio’, writing $F_{H_1H_2H_3}$ instead of $F_H$. The transformation matrix corresponding
to a transition to the ‘point of view’ of some other trio, e.g.
$K_1$, $K_2$, $K_3$, will then be unambiguously determined by the simultaneous
equations
$$a^{-1}(K_1)_{H_1H_2H_3}\,a = (K_1)_{K_1K_2K_3},$$
$$a^{-1}(K_2)_{H_1H_2H_3}\,a = (K_2)_{K_1K_2K_3}, \qquad (124\,\mathrm{c})$$
$$a^{-1}(K_3)_{H_1H_2H_3}\,a = (K_3)_{K_1K_2K_3},$$
with the condition that all the three matrices on the right side should
be diagonal (which can always be satisfied if the corresponding operators
$K_1$, $K_2$, $K_3$ commute with each other). Each of the equations (124c),
taken separately, will leave a certain amount of ambiguity in the shape
of the matrix $a$, which can be removed by means of one or both of
the others; if we do not desire a diagonal representation of the corresponding
quantities we can remove this ambiguity in a perfectly
arbitrary manner consistent with the condition $a^{-1} = a^\dagger$.

The preceding considerations can easily be generalized for the case
when either or both of the operators (or the operator trios) H and K
have a continuous spectrum. Let us assume, for instance, that the
values of H form a continuous set, while those of K remain discrete.
We then have, instead of (110) and (111), the transformation equations
(117) and (117a) with a semi-continuous transformation matrix
$a_{H'K'}$ satisfying the orthogonality and normalizing relations (118) and
(118a). The latter can be put in the same form,
$$aa^\dagger = \delta_H, \qquad a^\dagger a = \delta_K,$$
as in the discrete case, if $\delta_H$ is considered as a continuous unit matrix,
i.e. as a Dirac function
$$\delta_{H'H''} = \delta(H'-H''),$$
while $\delta_{K'K''}$ is the usual discrete unit matrix, and if, further, the matrix
multiplication law is defined in the usual way corresponding to discrete
matrices in the case of $aa^\dagger$:
$$(aa^\dagger)_{H'H''} = \sum_{K'} a_{H'K'}\,a^\dagger_{K'H''},$$
and in the way corresponding to continuous matrices in the case of $a^\dagger a$:
$$(a^\dagger a)_{K'K''} = \int a^\dagger_{K'H'}\,a_{H'K''}\,dH'$$
[cf. eq. (105 a), § 14].
We get further, instead of (121),
$$F^0_{K'K''} = \iint a^*_{H'K'}\,a_{H''K''}\,F^0_{H'H''}\,dH'\,dH'',$$
or
$$F^0_{K'K''} = \iint a^\dagger_{K'H'}\,F^0_{H'H''}\,a_{H''K''}\,dH'\,dH'',$$

which, as in the discrete case, can be written in the matrix form
$$F_K = a^\dagger F_H\,a,$$
it being understood that the matrix multiplication must be carried out
according to the rule for continuous matrices whenever the ‘summation’
indices are continuously variable. From this equation we can
derive the equations (122), the second of which, when reduced to matrix
elements, runs exactly as before [eq. (122b)], while the first assumes
the form
$$\int K_{H'H''}\,a_{H''K'}\,dH'' = K' a_{H'K'},$$
instead of (122a). Dropping the index $K'$ of the coefficients $a_{H'K'}$, we get
$$\int K_{H'H''}\,a_{H''}\,dH'' = K' a_{H'}, \qquad (125)$$
which can be considered as an integral equation for the determination
of the functions $a_{H'}$ and the characteristic values $K'$, replacing the
system of algebraic equations (123). The result of the elimination of
the functions $a_{H'}$ from (125) cannot be written in the form of a determinant
(123a) unless we adopt a generalized definition of ‘continuous
determinants’ corresponding to continuous matrices. Writing the right
side of (125) in the form $\int K' a_{H''}\,\delta(H''-H')\,dH''$, we could then replace
the compatibility equation (123a), which serves for the determination
of the characteristic values of K ($K' = K_{K'K'}$), by a symbolic equation
of the type
$$\bigl| K_{H'H''} - K'\delta(H'-H'') \bigr| = 0, \qquad (125\,\mathrm{a})$$
indicating the general element of the determinant. In the corresponding
notation for the discrete case, equation (123a) would run as follows:
$$\bigl| K_{H'H''} - K'\delta_{H'H''} \bigr| = 0.$$

Of course (125 a) cannot be used for the actual calculation of the values
$K'$; but this is also true of equation (123a), since it refers to a determinant
which consists of an infinite number of discrete elements.
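
Although (125a) is only symbolic, the integral equation (125) itself can be attacked numerically once the continuous index is sampled on a grid. The following is a minimal sketch under stated assumptions, not a method given in the text: the integral is replaced by a midpoint sum, repeated application of the kernel (power iteration) picks out the characteristic value of largest modulus, and the kernel $K_{H'H''} = H'H''$ on the interval from 0 to 1 is a hypothetical example whose only non-vanishing characteristic value is $\int_0^1 H'^2\,dH' = 1/3$.

```python
import math

def largest_characteristic_value(kernel, lo, hi, n=100, sweeps=20):
    """Midpoint-rule discretization of (125) followed by power iteration."""
    step = (hi - lo) / n
    grid = [lo + (i + 0.5) * step for i in range(n)]
    a = [1.0 / math.sqrt(hi - lo)] * n            # start vector with integral of a^2 equal to 1
    lam = 0.0
    for _ in range(sweeps):
        # (K a)(H') = integral over H'' of K(H', H'') a(H''), as a sum times the step
        ka = [sum(kernel(x, y) * a[j] for j, y in enumerate(grid)) * step for x in grid]
        lam = sum(u * v for u, v in zip(a, ka)) * step   # Rayleigh quotient for normalized a
        norm = math.sqrt(sum(v * v for v in ka) * step)
        a = [v / norm for v in ka]                # renormalize: integral of a^2 equal to 1
    return lam, grid, a

lam, grid, a = largest_characteristic_value(lambda h1, h2: h1 * h2, 0.0, 1.0)
```

The returned function `a` is proportional to $H'$ on the grid, and `lam` approaches 1/3 as the grid is refined. Power iteration is used here only because it needs nothing beyond the standard library; any eigenvalue routine applied to the discretized kernel would serve the same purpose.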

We shall indicate later the method which can be used for the approxi- 
mate calculation of the admissible values of K’ when K differs but 
little from H (as is the case in problems of the perturbation theory). 
It should be remarked here that both for a discrete and a continuous 
spectrum of H the characteristic values of K may form a discrete as well 
as a continuous spectrum (contrary to the assumption which was made 
at the beginning about the discreteness of the K-spectrum).

It can easily be proved that if the functions $a_{H'}$ [‘characteristic
functions’ of the integral equation (125)] corresponding to a particular
value $K'$ are labelled with this value as second index, they will form
an orthogonal set (discrete or continuous, together with the set of
values of $K'$) and normalizable to unity, i.e. satisfying the relations
$$\int a_{H'K'}\,a^*_{H'K''}\,dH' = \delta_{K'K''} \quad \text{or} \quad \delta(K'-K'')$$
and
$$\sum_{K'} a_{H'K'}\,a^*_{H''K'} \quad \text{or} \quad \int a_{H'K'}\,a^*_{H''K'}\,dK' = \delta(H'-H'')$$
as the case may be.

The proof is obtained in exactly the same way as in the case of a
discrete $H'$-spectrum dealt with above and therefore will not be reproduced
here. It should be remarked incidentally that the results referring
to the latter case must be amended to allow for the possibility of K
having a continuous spectrum with $K_{K'K''} = K'\delta(K'-K'')$.

Summing up, we can say that both with a discrete and a continuous
spectrum of the ‘basic quantity’ (or basic trio) H, it is possible to
calculate the matrix elements of any quantity F from the point of view
of some other ‘basic quantity’ (or basic trio) K, without the knowledge
of the characteristic functions of either H or K; the only thing which
it is necessary to know in order to carry out the transformation from
$F_H$ to $F_K$ is the matrix $K_H$. The transformation coefficients $a_{H'K'}$ can
be found from the condition that $K_K$ is a diagonal matrix of the discrete
or of the continuous type (which need not and cannot be specified
beforehand).


17. Transformation Theory of Matrices as a Generalization of 
Wave Mechanics; Transformation of Basic Quantities 


It thus appears that the matrix theory, so far as the transformation 
from one point of view to another is concerned, can be considered as 
a logically closed self-supporting structure, which does not need the 
wave-mechanical basis upon which we have built it up. We have 
already met with a similar situation in the preceding chapter, when we 
were discussing the question of the actual determination of the matrices 
corresponding to a given energy operator and found it possible to 
achieve this result by determining the fundamental Hermitian matrices 
of the coordinates and the momentum-components in such a way as to 
make the energy matrix diagonal subject to the commutation conditions 
$p_x x - x p_x = h/2\pi i$, etc.

In the light of the transformation theory developed in this chapter, 
it appears, first of all, that if the latter problem has been solved for 
some simple type of motion specified by the energy operator H, it can 


be solved for any other type of motion, specified by some more complicated
energy operator K, by the method of the transformation
theory, without getting back to fundamental matrices ($x$, $p_x$) and commutation
conditions (which, as has been shown above, are invariant
with respect to canonical transformations). It is just this method of
solution which is used by the perturbation theory, when the difference
between the operators K and H is sufficiently small.

Besides furnishing a simple and practically the only workable method
for the solution of such perturbation problems, the transformation
theory reveals a new connexion between the matrix and the wave-mechanical
method, reducing the latter to a particular case of the former, as was
pointed out in the preceding section. We have seen, namely, that
the characteristic functions or probability amplitudes of the wave-mechanical
theory $\psi^0_{x'H'}$ can be considered as the transformation coefficients
from the point of view of the ‘energy-trio’ H to that of the
‘coordinate-trio’ x (provided that such a thing as the energy exists,
i.e. that the energy operator H does not contain the time), in the same
sense as the probability amplitudes $a_{H'K'}$ are the transformation coefficients
from the point of view of the energy-trio H to that of the energy-trio
K. This means that the wave-mechanical method can be completely
replaced by the matrix method involving the transformation of the
matrices $F_x$ to the matrices $F_H$, or vice versa.

The wave-mechanical theory, considered as a special case of the 
matrix transformation theory, has to solve the following problem: 
Suppose the matrices of all quantities, and in particular of the energy 
H, to be known from the point of view of the coordinates, we have to 
find the matrices representing them from the point of view of H. The 
solution of this problem reduces to the solution of the linear integral
equation,
$$\int H_{x'x''}\,\psi^0_{x''}\,dx'' = H'\psi^0_{x'}, \qquad (126)$$
which is obtained from (125) if K is replaced by H, H by x, and $a_{H'}$ by $\psi^0_{x'}$,
and which obviously must be equivalent to the Schrödinger equation†
$$H\psi^0 = H'\psi^0 \qquad (\psi^0 = \psi^0_{x'H'}). \qquad (126\,\mathrm{a})$$
The equivalence of these equations can be proved directly with the
help of the general definition of the elements of a matrix $F_C$ by means
of the formula
$$F^0_{C'C''} = \int \psi^{0*}_{x'C'}\,F\,\psi^0_{x'C''}\,dx'. \qquad (127)$$

† We mean here and in the sequel Schrödinger’s equation not involving the time (and
serving to define the stationary states only). This circumstance is indicated by affixing 
to all the quantities connected—directly or indirectly—with the energy operator K the 
additional (upper) index 0. 


This definition has been used until now only in connexion with such
‘key’ or ‘basic’ quantities C, one of which at least could be regarded
as the energy. This restriction does not seem, however, to be necessary,
and the formula (127) can be applied to quantities C of any type (provided
the operators by which they are represented commute with each
other). We can, in particular, put $C = x$ (i.e. $C_1 = x$, $C_2 = y$, $C_3 = z$),
subject to the condition that the variables $x'$ and $C'$ in $\psi^0_{x'C'}$ should be
considered as independent. This means that the two indices (or arguments)
in the function $\psi^0_{x'C'}$ need not necessarily refer to the same point.

We can thus in (127) put $C' = x'$ and $C'' = x''$, or, denoting the
integration variable by $x'''$ instead of $x'$, write
$$F^0_{x'x''} = \int \psi^{0*}_{x'''x'}\,F\,\psi^0_{x'''x''}\,dx''', \qquad (127\,\mathrm{a})$$
where the operator F is understood to refer to the point $x'''$, i.e. to be
a function of $x'''$ and of the elementary operators $p_{x'''} = \frac{h}{2\pi i}\frac{\partial}{\partial x'''}$.

The functions $\psi^0_{x''x'}$ must obviously represent the identical transformation
(from the point of view of $x''$ to that of $x'$), or, in other words,
the probability amplitudes that x should be equal to $x''$ when it is
known that it has the value $x'$. Since one and the same particle cannot
be simultaneously in two different places, this means that $\psi^0_{x''x'}$ must
vanish when $x'' \neq x'$ and become infinite when $x'' = x'$ (in view of the
fact that x is a continuous variable). We can thus identify $\psi^0_{x''x'}$ with
the ‘unit matrix’ of the continuous case, i.e. put
$$\psi^0_{x''x'} = \delta(x''-x'). \qquad (128)$$

This expression can be derived from the general formula
$$\varphi^0_{x'C'} = \int \psi^0_{x'H'}\,a_{H'C'}\,dH'$$
[cf. (119), § 15] if we put $C' = x''$ and accordingly $a_{H'C'} = a_{H'x''} = \psi^{0*}_{x''H'}$,
in conjunction with the orthogonality and normalizing relation
$\int \psi^0_{x'H'}\,\psi^{0*}_{x''H'}\,dH' = \delta(x'-x'')$, the $\varphi^0_{x'C'}$ being in this case obviously
identical with $\psi^0_{x'x''}$.

It is easy to see that, defined in this way, the function $\varphi^0_{x'C'} = \psi^0_{x'C'}$
also satisfies the usual orthogonality and normalizing relations:
$$\int \varphi^{0*}_{x'C'}\,\varphi^0_{x'C''}\,dx' = \delta(C'-C''), \qquad \int \varphi^0_{x'C'}\,\varphi^{0*}_{x''C'}\,dC' = \delta(x'-x'').$$
In fact, putting $C' = x'$ and $C'' = x'''$ (with the integration variable denoted by $x''$), we get, according to (128),
$$\int \varphi^{0*}_{x''x'}\,\varphi^0_{x''x'''}\,dx'' = \int \delta(x'-x'')\,\delta(x'''-x'')\,dx'' = \delta(x'-x'''),$$

§ 17 GENERALIZATION OF WAVE MECHANICS 151 
and, putting C’ = x”, 
[es Pee AC’ = { 8(x” —2')8(a”— 2") dx” = 8(x'—2x"). 


We thus see that the elements of a matrix $F_x$ can be defined according to (127) and (128) by the integral

$$F_{x'x''} = \int \delta(x' - x''')\, F\, \delta(x''' - x'')\, dx'''; \tag{128a}$$

so that, in particular, we have

$$H_{x'x''} = \int \delta(x' - x''')\, H\, \delta(x''' - x'')\, dx''', \tag{128b}$$

where $H$ denotes the usual Hamiltonian function of the coordinates $x$ and of the 'components of the momentum' $p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$, both referred to the point $x = x'''$ ($dx'''$ indicates the volume-element enclosing this point).

It can now easily be shown that the integral equation (126), together 
with the expression (128b) for its ‘nucleus’, actually reduces to the 
differential equation (126 a). 

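Let us first take that part of $H$ which depends upon the coordinates, that is, the potential energy $U(x, y, z)$. We then get, according to (128a),

$$U_{x'x''} = \int U(x''')\,\delta(x' - x''')\,\delta(x''' - x'')\, dx''' = U(x'')\,\delta(x'' - x'),$$

which, on substitution in (126), gives

$$\int U_{x'x''}\, \psi^0_{x''}\, dx'' = U(x')\,\psi^0_{x'}.$$

Putting, further, $F = \partial/\partial x$, we have

$$F_{x'x''} = \int \delta(x' - x''')\,\frac{\partial}{\partial x'''}\,\delta(x''' - x'')\, dx''' = -\int \delta(x' - x''')\,\frac{\partial}{\partial x''}\,\delta(x''' - x'')\, dx''',$$

since, obviously,

$$\frac{\partial}{\partial x'''}\,\delta(x''' - x'') = -\frac{\partial}{\partial x''}\,\delta(x''' - x''),$$

and consequently

$$\int F_{x'x''}\, \psi^0_{x''}\, dx'' = -\int \delta(x' - x''')\, dx''' \int \psi^0_{x''}\,\frac{\partial}{\partial x''}\,\delta(x''' - x'')\, dx''.$$

Now integrating by parts, we have

$$\int \psi^0_{x''}\,\frac{\partial}{\partial x''}\,\delta(x''' - x'')\, dx'' = -\int \delta(x''' - x'')\,\frac{\partial \psi^0_{x''}}{\partial x''}\, dx'',$$

because the product $\psi^0_{x''}\,\delta(x''' - x'')$ vanishes at the limits of integration (or at infinity). We thus get

$$\int F_{x'x''}\, \psi^0_{x''}\, dx'' = \int dx''\,\frac{\partial \psi^0_{x''}}{\partial x''} \int dx'''\,\delta(x' - x''')\,\delta(x''' - x'') = \frac{\partial \psi^0_{x'}}{\partial x'}.$$

In the same way it can be shown that

$$\int F_{x'x''}\, \psi^0_{x''}\, dx'' = \left(\frac{\partial}{\partial x'}\right)^2 \psi^0_{x'}$$

if $F = (\partial/\partial x)^2$, and so on. Putting finally

$$F = \frac{1}{2m}\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}\right)^2 + U = H,$$

we get $\int H_{x'x''}\, \psi^0_{x''}\, dx'' = H\psi^0_{x'}$. It should be mentioned that this formula holds identically, i.e. irrespective of the shape of the function $\psi^0$. The latter is determined in fact as $\psi^0_{x'H'}$ by the condition that $H\psi^0$ should be equal to the product $H'\psi^0$.

The generalization of the matrix theory which has been considered hitherto consisted, in the main, in admitting quantities other than the energy and those commuting with the energy to the role of the 'basic quantities' determining the matrix representation of all other quantities and being themselves represented by diagonal matrices. In the case just considered, this role of basic quantities was switched over to the coordinates. The matrices representing the latter, $x_x$ (or $x_{x'x''}$, $y_{x'x''}$, $z_{x'x''}$), are obviously defined by the equations [cf. (101), § 14]

$$x_{x'x''} = x'\,\delta(x' - x''), \tag{129}$$

or, written out in detail:

$$\begin{aligned}
x_{x'y'z';\,x''y''z''} &= x'\,\delta(x' - x'')\,\delta(y' - y'')\,\delta(z' - z''),\\
y_{x'y'z';\,x''y''z''} &= y'\,\delta(x' - x'')\,\delta(y' - y'')\,\delta(z' - z''),\\
z_{x'y'z';\,x''y''z''} &= z'\,\delta(x' - x'')\,\delta(y' - y'')\,\delta(z' - z'').
\end{aligned} \tag{129a}$$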
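The content of (128a) and (129) can be checked numerically in a rough way. The following is only a sketch under an assumption not made in the text: the continuous coordinate is replaced by a uniform grid of spacing `dx`, so that $\delta(x' - x'')$ becomes `1/dx` on the diagonal, integrals become sums multiplied by `dx`, and $\partial/\partial x$ becomes a central difference. All names (`delta`, `x_kernel`, `ddx_kernel`, `apply_kernel`) are illustrative.

```python
import math

# Hypothetical grid (an assumption for illustration): N points, spacing dx.
N, dx = 200, 0.05
xs = [(i - N // 2) * dx for i in range(N)]

def delta(i, j):
    """Grid stand-in for delta(x_i - x_j): 1/dx on the diagonal, 0 elsewhere."""
    return 1.0 / dx if i == j else 0.0

def x_kernel(i, j):
    """Matrix of the coordinate, eq. (129): x' * delta(x' - x'')."""
    return xs[i] * delta(i, j)

def ddx_kernel(i, j):
    """Kernel of F = d/dx built as in (128a), with d/dx -> central difference."""
    if j == i + 1:
        return 1.0 / (2 * dx * dx)
    if j == i - 1:
        return -1.0 / (2 * dx * dx)
    return 0.0

def apply_kernel(kernel, psi):
    """Discrete version of the integral of F_{x'x''} psi(x'') over x''."""
    return [sum(kernel(i, j) * psi[j] for j in range(N)) * dx for i in range(N)]

psi = [math.exp(-x * x / 2) for x in xs]   # arbitrary smooth test function
dpsi = apply_kernel(ddx_kernel, psi)       # approximates d(psi)/dx = -x * psi
xpsi = apply_kernel(x_kernel, psi)         # diagonal matrix: gives x * psi
```

In this discretized picture the kernel of $\partial/\partial x$ acts on any smooth function as the derivative itself, up to the finite-difference error, while the coordinate matrix is diagonal and simply multiplies the function by $x$.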


The coordinates have, however, preserved at the same time another fundamental role in which they have been employed from the very beginning, namely, that of the arguments of the functions $\psi^0_C$ (with $C = H$, $x$, or any other 'basic trio') which can serve for the direct determination of the elements of a matrix $F_C$ by means of equation (127). This second role of the coordinates is intimately connected with the initially adopted representation of physical quantities by means of operators, defined as functions of the (rectangular) coordinates $x$, $y$, $z$ and of the elementary differential operators

$$p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}, \qquad p_y = \frac{h}{2\pi i}\frac{\partial}{\partial y}, \qquad p_z = \frac{h}{2\pi i}\frac{\partial}{\partial z},$$

which replace the components of the momentum.

These functions were supposed to be known, being in fact identified with the functions representing the same quantities in the classical theory (on the ground that $F(x, p_x)\psi$ reduces to the product $F(x, g_x)\psi$ if $\psi$ is replaced by its approximate expression $\psi^0 = e^{i2\pi S/h}$, where $S$ is the action function of classical mechanics).

We must now consider a further generalization of the transformation theory, consisting in the replacement of the coordinates in this second role, connected with the usual operator representation, by some other quantities, e.g. $Q$, associated with operators which contain derivatives with regard to $Q$.

The possibility (and, more than that, the necessity) of such a generalization clearly follows from the fact that the functions $\psi^0_{x'C'}$, considered as transformation coefficients 'from the point of view of $C$ to that of $x$', or as probability amplitudes for one of these two quantities having a given value when the value of the other is known, are practically symmetrical with regard to both quantities. Instead of (or rather together with) the functions $\psi^0_{x'C'}$, we must consider the functions $\psi^0_{C'x'}$, which are simply equal to the conjugate complex of the former and which correspond to the reciprocal transformation. In these functions, however, it is the quantities $C$ which play the role of the coordinates, while the latter appear in the role of the 'basic quantities' instead of $C$.

Replacing the Schrödinger wave functions $\psi^0_{x'C'}$ by transformation coefficients or probability amplitudes of the most general type $a_{Q'C'}$, we can define the matrix elements of a certain quantity $F$ with respect to $C$ by the formulae

$$F_{C'C''} = \int a^{*}_{Q'C'}\, F\, a_{Q'C''}\, dQ',$$

or

$$F_{C'C''} = \sum_{Q'} a^{*}_{Q'C'}\, F\, a_{Q'C''},$$

according as $Q$, the variable summed over, has a continuous or a discrete spectrum.

This definition will, however, remain meaningless so long as $F$ is not specified as an operator 'from the point of view' of $Q$, i.e. as a certain function of $Q$ ($Q_1$, $Q_2$, $Q_3$) and the derivatives $\partial/\partial Q$. The operators which have been considered hitherto have always been specified from the point of view of the coordinates $x$, and obtained from the classical functions $F(x, g_x)$ by a simple substitution of $p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$ for $g_x$. Adopting what can be denoted as the 'principle of relativity' with regard to the 'basic quantities' which specify the operator representation, we shall denote the operator representing a certain quantity $F$ 'from the point of view of $Q$' by $F_{(Q)}$, where the brackets are introduced to distinguish this operator from the corresponding matrix $F_Q$. The operators defined in the usual way, i.e. from the point of view of the coordinates, should be denoted accordingly by $F_{(x)}$, and the general definition of the elements of the matrix $F_C$ by means of the operator $F_{(Q)}$ should run as follows:

$$F_{C'C''} = \int a^{*}_{Q'C'}\, F_{(Q)}\, a_{Q'C''}\, dQ' = \int a^{-1}_{C'Q'}\, F_{(Q)}\, a_{Q'C''}\, dQ' \tag{130}$$

if the spectrum of $Q$ is continuous, or

$$F_{C'C''} = \sum_{Q'} a^{*}_{Q'C'}\, F_{(Q)}\, a_{Q'C''} = \sum_{Q'} a^{-1}_{C'Q'}\, F_{(Q)}\, a_{Q'C''} \tag{130a}$$

if it is discontinuous.

Another obvious condition for the operators $F_{(Q)}$ is that the matrix elements of $F_C$ defined by the preceding equations should not depend upon the choice of the quantities $Q$.

Equations (130) and (130a) bear a striking resemblance to the transformation equations

$$F_{C'C''} = \iint a^{-1}_{C'Q'}\, F_{Q'Q''}\, a_{Q''C''}\, dQ'\, dQ'',$$

and

$$F_{C'C''} = \sum_{Q'} \sum_{Q''} a^{-1}_{C'Q'}\, F_{Q'Q''}\, a_{Q''C''},$$

or, in the abbreviated notation based on the matrix multiplication law,

$$F_C = a^{-1} F_Q\, a = a^{\dagger} F_Q\, a,$$

with $a$ denoting the transformation matrix $a_{Q'C'}$.

The equations of both types actually become identical if the operators $F_{(Q)}$ satisfy the condition

$$F_{(Q)}\, a_{Q'C''} = \int F_{Q'Q''}\, a_{Q''C''}\, dQ'', \tag{131}$$

or

$$F_{(Q)}\, a_{Q'C''} = \sum_{Q''} F_{Q'Q''}\, a_{Q''C''}. \tag{131a}$$

These conditions are a generalization of the equation

$$\int H_{x'x''}\, \psi^0_{x''}\, dx'' = H\psi^0_{x'},$$

which has already been obtained in connexion with the proof of the equivalence of the Schrödinger equation (126a) with the integral equation (126). It should be observed that, according to the present notation, we must write $H_{(x)}$ for the energy operator, and $a_{x'H'}$ for the wave functions $\psi^0$. Further, we easily get as a generalization of equation (128b) the following relation between the operator $F_{(Q)}$ and the matrix $F_Q$:

$$F_{Q'Q''} = \int \delta(Q' - Q''')\, F_{(Q)}\, \delta(Q''' - Q'')\, dQ''', \tag{131b}$$

where the functions $\delta(Q' - Q'')$ can be considered as the transformation coefficients $a_{Q'Q''}$ on the assumption that the spectrum of $Q$ is continuous. The formula (131b) can be considered as the direct consequence of (130).

Putting $F = C$ in (131) and taking into account that

$$\int C_{Q'Q''}\, a_{Q''C'}\, dQ'' = C'\, a_{Q'C'}$$

according to the definition of the transformation coefficients $a_{Q'C'}$ [cf. equations (125) and (126)], we get

$$C_{(Q)}\, a_{Q'C'} = C'\, a_{Q'C'}. \tag{132}$$


This equation is the broadest generalization of Schrödinger's equation, with $C$ standing for $H$, $Q$ for $x$, and the probability amplitudes $a_{Q'C'}$ (which could also be denoted by $\psi_{Q'C'}$) for the usual 'wave functions' $\psi^0_{x'H'}$. If the form of the operator $C_{(Q)}$ as a function of $Q$ and of $\frac{h}{2\pi i}\frac{\partial}{\partial Q}$ is known, equation (132) can serve to determine the functions $\psi_{C'}$ ($= a_{Q'C'}$) and the characteristic values $C'$ of the operator $C_{(Q)}$. It should be remarked that these characteristic values do not depend upon the choice of the basic quantities $Q$ (i.e. are invariant with regard to the transformation of the latter), being as a matter of fact nothing else but the characteristic values of the operator $C_{(C)}$, or, in other words, the (diagonal) elements of the matrix $C_C$. This corresponds to the physical meaning of the characteristic values of a quantity, as the values which this quantity can possibly assume, irrespective of the values which can be, or actually are, assumed by any other quantities.
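The invariance of the characteristic values under a change of the basic quantities can be illustrated with a finite matrix. The sketch below is not taken from the text: it assumes a hypothetical two-state system, an arbitrarily chosen Hermitian matrix `C_Q` and an arbitrarily chosen unitary matrix `a`, and compares the characteristic values of `C_Q` with those of $a^{\dagger} C_Q\, a$.

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def eigenvalues_2x2_hermitian(M):
    """Characteristic values of a 2x2 Hermitian matrix from trace and determinant."""
    tr = (M[0][0] + M[1][1]).real
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    root = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return sorted([(tr - root) / 2, (tr + root) / 2])

# Hypothetical quantity C in the Q-representation (Hermitian by construction).
C_Q = [[2.0 + 0j, 1.0 - 1.0j],
       [1.0 + 1.0j, -1.0 + 0j]]

# Hypothetical unitary transformation matrix a (angles chosen arbitrarily).
theta, phi = 0.7, 0.3
a = [[math.cos(theta) + 0j, -math.sin(theta) * cmath.exp(-1j * phi)],
     [math.sin(theta) * cmath.exp(1j * phi), math.cos(theta) + 0j]]

# Same quantity from another "point of view": a^dagger C_Q a.
C_new = matmul(matmul(dagger(a), C_Q), a)

values_old = eigenvalues_2x2_hermitian(C_Q)
values_new = eigenvalues_2x2_hermitian(C_new)
```

Both lists agree to rounding error, which is the finite-dimensional counterpart of the statement that the values $C'$ do not depend on the choice of $Q$.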

In deriving equation (132), we have assumed that the characteristic values of $Q$ constitute a continuous set. If they constitute a discrete set, the differential operator representation of different quantities $F$ with regard to $Q$ becomes impossible, for the application of the derivative operators $\partial/\partial Q$ to functions of $Q$ becomes meaningless. Equation (131a) can hold accordingly only when the operator $F_{(Q)}$ reduces to a function of $Q$. The same refers to the equation

$$F_{Q'Q''} = \sum_{Q'''} \delta_{Q'Q'''}\, F_{(Q)}\, \delta_{Q'''Q''},$$

which should replace equation (131b) and which is meaningless unless the operator $F_{(Q)}$ reduces to a function of $Q$ (not containing the derivatives $\partial/\partial Q$), in which case it reduces to

$$F_{Q'Q''} = F(Q')\,\delta_{Q'Q''},$$

meaning that $F_Q$ is a diagonal matrix.

This example shows that the matrix theory, which we initially developed on the basis of the operator theory, starting with the energy operator $H_{(x)}$ and the wave functions defined by it according to Schrödinger's equation $H_{(x)}\psi^0 = H'\psi^0$, is actually more general than the operator theory even in its generalized form corresponding to the replacement of the coordinates $x$ by some other trio of quantities with continuously variable values.†

Another and perhaps logically more satisfactory procedure would be to start (following Heisenberg, Jordan, and Dirac) from the other end, i.e. with the matrix representation of physical quantities, deriving the operator representation as an alternative form of it for the case when the basic quantities admit continuously variable values, and using the transformation theory for the definition of the probability amplitudes $a_{Q'C'}$ and, in particular, of the wave functions $\psi^0_{x'H'}$ of the de Broglie-Schrödinger wave-mechanical theory.

This purely deductive method has, however, from a didactic point 
of view, the disadvantage of being too abstract and of starting with 
ideas completely alien to customary or ‘classical’ conceptions. The 
inductive method, which is adopted in this book, and which makes an 
appeal not only to the logic but also to the intuition of the reader, 
gradually leading him from the concrete customary conceptions to the 
abstract new ideas, may prove more helpful for those who have to get 
used to these new ideas and perform the logically simple but psycho- 
logically difficult task of getting rid of the old conceptions. 

To this it should be added that the matrix theory remains an empty scheme so long as no concrete assumptions are made about the commutation properties and the functional relationship of the matrices concerned, the problem consisting in the actual determination of the elements of these matrices from a certain 'point of view' (after which a transition to some other point of view and the determination of the corresponding probability amplitudes can be made with the help of the transformation theory). These assumptions, however, involve considerations which lie outside the logical realm of the matrix theory and can hardly be understood without the fundamental idea of the wave-mechanical theory, namely, that the motion of a particle in a given field of force is determined in terms of probabilities by the propagation of the associated waves.

† It would be possible to extend the operator theory to the discrete case if differential coefficients were replaced by finite differences.

This refers in particular to the commutation relations between the fundamental matrices $x$ and $p$,

$$px - xp = \frac{h}{2\pi i}, \tag{133}$$

in conjunction with the fact that the latter have to be defined as the components of the momentum in the classical expression of the energy $H$ (replaced by the matrix $H_x$).

After these relations, which correspond to the quantum conditions of Bohr's theory, have been established, the whole problem of the wave-mechanical theory can be stated as the transformation of all the matrices involved (and in the first place of $x$, $p$, and $H$) from the point of view of $x$ to that of $H$, the transformation coefficients $\psi^0_{x'H'}$ being the probability amplitudes of finding the particle in a given position when its energy is known or with a given energy if its position is known. The actual solution of this problem is usually reduced to the solution of Schrödinger's equation involving the operator $H_{(x)}$.

As an illustration of the 'principle of relativity' with respect to the basic quantities in the operator representation, we shall consider the results which are obtained if the coordinates are replaced in this role by the momenta $p$. The latter must be considered in this case as ordinary quantities ($= Q$), while the coordinates, in order that the 'quantum conditions' (133) should be satisfied, must be defined as differential operators according to the formulae

$$x = -\frac{h}{2\pi i}\frac{\partial}{\partial p_x}, \qquad y = -\frac{h}{2\pi i}\frac{\partial}{\partial p_y}, \qquad z = -\frac{h}{2\pi i}\frac{\partial}{\partial p_z}. \tag{133a}$$

The energy operator $H_{(p)}$ can be determined accordingly as the operator resulting from the substitution in the classical Hamiltonian function $(p_x^2 + p_y^2 + p_z^2)/(2m) + U(x, y, z)$ of the elementary operators (133a) for the coordinates. The new wave functions $\psi^0_{p'H'}$ corresponding to this definition of the energy operator are determined by the differential equation [cf. (132)]:

$$H_{(p)}\,\psi^0_{p'} = H'\,\psi^0_{p'}, \tag{133b}$$

which in general is entirely different from that of Schrödinger. The kinetic energy $(p_x^2 + p_y^2 + p_z^2)/(2m)$, which in the $x$-representation reduced to the Laplacian differential operator of the wave theory $\nabla^2$ [multiplied by $-h^2/(8\pi^2 m)$], in the $p$-representation remains an ordinary quantity, or more exactly an ordinary factor which has to be multiplied by the function $\psi^0_{p'}$, while the potential energy becomes a differential operator acting on this function. The result of the operation $H_{(p)}$ is equivalent to the multiplication of $\psi^0_{p'}$ by a constant factor $H'$, one of the characteristic values of $H$. As stated above, these characteristic values must be the same whether we start with the basic quantities $x$ or $p$.

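The probability amplitudes $\psi^0_{p'H'}$ are, however, in general, functions of $p'$ entirely different from the ordinary wave functions $\psi^0_{x'H'}$ (with the exception of the case of the harmonic oscillator, where the potential energy is the same quadratic function of the coordinates as the kinetic energy is of the momentum components). According to the fundamental equation of the transformation theory [see, for instance, (119b)] they must be connected with each other by the relations

$$\begin{aligned}
\psi^0_{x'H'} &= \int a_{x'p'}\, \psi^0_{p'H'}\, dp',\\
\psi^0_{p'H'} &= \int a^{-1}_{p'x'}\, \psi^0_{x'H'}\, dx' = \int a^{*}_{x'p'}\, \psi^0_{x'H'}\, dx',
\end{aligned} \tag{134}$$

where the transformation coefficients $a_{x'p'}$ can be defined by the operator equation

$$p_{(x)}\, a_{x'p'} = p'\, a_{x'p'}, \tag{134a}$$

that is,

$$\frac{h}{2\pi i}\frac{\partial}{\partial x'}\, a_{x'p'} = p'_x\, a_{x'p'}, \qquad \frac{h}{2\pi i}\frac{\partial}{\partial y'}\, a_{x'p'} = p'_y\, a_{x'p'}, \qquad \frac{h}{2\pi i}\frac{\partial}{\partial z'}\, a_{x'p'} = p'_z\, a_{x'p'}.$$

This gives

$$a_{x'p'} = \frac{1}{\sqrt{h}}\, e^{i 2\pi\, p' \cdot r'/h}, \tag{134b}$$

$p' \cdot r'$ denoting the scalar product of the vectors $p'$ and $r'$, i.e. the sum $p'_x x' + p'_y y' + p'_z z'$. The coefficient $1/\sqrt{h}$ (one such factor for each coordinate) follows from the orthogonality and normalizing relation

$$\int a^{*}_{x'p'}\, a_{x'p''}\, dx' = \delta(p' - p''), \quad \text{or} \quad \int a^{*}_{x'p'}\, a_{x''p'}\, dp' = \delta(x' - x'').$$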
The same result is obtained if the functions $a_{x'p'}$, or rather $a^{-1}_{p'x'}$, are defined by the operator equation

$$x_{(p)}\, a^{-1}_{p'x'} = x'\, a^{-1}_{p'x'},$$

which, because $x_{(p)} = -\frac{h}{2\pi i}\frac{\partial}{\partial p}$, gives

$$a^{-1}_{p'x'} = \frac{1}{\sqrt{h}}\, e^{-i 2\pi\, p' \cdot r'/h}, \tag{134c}$$

in agreement with the relation $a^{-1}_{p'x'} = a^{*}_{x'p'}$.

Substituting these expressions in (134), we get

$$\psi^0_{x'H'} = \frac{1}{\sqrt{h}} \int \psi^0_{p'H'}\, e^{i 2\pi\, p' \cdot r'/h}\, dp', \tag{135}$$

$$\psi^0_{p'H'} = \frac{1}{\sqrt{h}} \int \psi^0_{x'H'}\, e^{-i 2\pi\, p' \cdot r'/h}\, dx', \tag{135a}$$

where $p' \cdot r' = p'_x x' + p'_y y' + p'_z z'$, $dp'$ stands for $dp'_x\, dp'_y\, dp'_z$ and $dx'$ for $dx'\, dy'\, dz'$, the factor $1/\sqrt{h}$ being taken once for each coordinate.
The first of these formulae can obviously be regarded as the expansion of the function $\psi^0_{x'H'}$ in a Fourier integral with the amplitude coefficients $\frac{1}{\sqrt{h}}\psi^0_{p'H'}$, while the second gives the explicit expression of these coefficients. Remembering the wave-mechanical interpretation of the vector $p'/h$ as equal to the reciprocal of the wave-length and pointing in the direction of the propagation of the waves associated with the motion of the particle, we can regard the transformation coefficients $a_{x'p'}$ as plane sine waves (without the time factor, however!), and we can interpret the transformation equation (135) as the representation of the wave function $\psi^0_{x'H'}$ by means of a superposition of plane sine waves with appropriate amplitudes and travelling in appropriate directions. This physical interpretation is in complete harmony with the physical meaning of the Fourier amplitudes $\psi^0_{p'H'}$, as the probability amplitudes for the particle to have a definite momentum $p'$ (irrespective of its position) for a given value $H'$ of its total energy to which the function $\psi^0_{x'H'}$ refers.
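The Fourier relation (135a) and the exceptional position of the harmonic oscillator can be checked numerically in one dimension. The sketch below rests on assumptions that are not part of the text: units are chosen so that $h = 2\pi$ (i.e. $h/2\pi = 1$) and $m = \omega = 1$, the ground-state function is taken as $\pi^{-1/4} e^{-x^2/2}$, and the integral is approximated by a midpoint sum over a finite interval. The names `psi_x`, `psi_p`, `half_width` and `steps` are illustrative.

```python
import cmath
import math

H_PLANCK = 2 * math.pi  # assumed units: h = 2*pi, so that h/(2*pi) = 1

def psi_x(x):
    """Oscillator ground state in the x-representation (units m = omega = 1)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2)

def psi_p(p, half_width=10.0, steps=4000):
    """One-dimensional form of (135a):
    h^(-1/2) * integral of psi(x) * exp(-i 2 pi p x / h) dx, by the midpoint rule."""
    dx = 2 * half_width / steps
    total = 0j
    for k in range(steps):
        x = -half_width + (k + 0.5) * dx
        total += psi_x(x) * cmath.exp(-2j * math.pi * p * x / H_PLANCK)
    return total * dx / math.sqrt(H_PLANCK)

# The momentum amplitude has the same Gaussian form as the coordinate amplitude.
samples = [0.0, 0.5, 1.0, 2.0]
differences = [abs(psi_p(p) - psi_x(p)) for p in samples]
```

For this particular state the amplitude in $p$ coincides with the same Gaussian function evaluated at $p$, in line with the remark that for the oscillator the potential energy is the same quadratic function of the coordinates as the kinetic energy is of the momenta. For other potentials the two functions would in general differ.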

We shall not consider in further detail the generalized transformation theory and its application to operators other than $x$, $p$, and $H$. There is, however, one particular class of transformations which have been alluded to at the end of § 16 as 'canonical transformations of the second kind' and which deserve special notice. They consist in a transition from the original trio of (rectangular) coordinates ($x$) and the associated momentum operators $\left(p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}\right)$ to some new basic trio of mutually commuting coordinates ($Q$) and mutually commuting momenta ($P$) satisfying the commutation relation

$$PQ - QP = \frac{h}{2\pi i} \tag{136}$$

for a given motion specified by a definite energy operator

$$H_{(x)} = H(x, y, z, p_x, p_y, p_z),$$

which is thereby transformed into $H_{(Q)} = H_{(Q)}(Q_1, Q_2, Q_3, P_1, P_2, P_3)$. The quantities $P$ and $Q$ satisfying the above relations are said to be 'canonically conjugate' with each other. From the point of view of the new coordinates ($Q$) the new momenta ($P$) are represented by the operators $P_{(Q)} = \frac{h}{2\pi i}\frac{\partial}{\partial Q}$ (just as the $Q$'s are represented from the point of view of the $P$'s by the operators $Q_{(P)} = -\frac{h}{2\pi i}\frac{\partial}{\partial P}$). An operator representation of the $P$'s from the point of view of the original coordinates ($x$) is, however, possible in the particular case only when the $Q$'s are defined as certain functions of the $x$'s not involving the $p_x$'s or the $P$'s. In this case, which corresponds to the 'point transformation' of the classical theory, the new momenta ($P$) can be expressed as certain functions of the original ones $p_x$ (involving as parameters the coordinates $x$ or $Q$). In the general case of a canonical transformation corresponding to a 'contact transformation' of the classical theory such a relationship between the new and the old variables does not exist and some kind of matrix representation must be used for the definition of the latter. The relationship between the new and the old variables can be expressed with the help of a certain transformation matrix $\Phi$ according to the equations

$$Q = \Phi^{-1} x\, \Phi, \qquad P = \Phi^{-1} p\, \Phi,$$

that is,

$$\begin{aligned}
Q_1 &= \Phi^{-1} x\, \Phi, & Q_2 &= \Phi^{-1} y\, \Phi, & Q_3 &= \Phi^{-1} z\, \Phi,\\
P_1 &= \Phi^{-1} p_x \Phi, & P_2 &= \Phi^{-1} p_y \Phi, & P_3 &= \Phi^{-1} p_z \Phi.
\end{aligned} \tag{136a}$$


These equations automatically secure the fulfilment of the commutation relations which must exist between the new variables

$$Q_i Q_k - Q_k Q_i = 0, \qquad P_i P_k - P_k P_i = 0, \qquad P_i Q_k - Q_k P_i = \frac{h}{2\pi i}\,\delta_{ik} \tag{136b}$$

as a consequence of those existing between the original ones.
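Why a similarity transformation carries commutation relations over unchanged can be seen from the identity $\Phi^{-1} A \Phi\, \Phi^{-1} B \Phi - \Phi^{-1} B \Phi\, \Phi^{-1} A \Phi = \Phi^{-1}(AB - BA)\Phi$. The sketch below checks it numerically. It is an illustration only: finite matrices cannot satisfy (133) exactly, so two arbitrary Hermitian $2 \times 2$ matrices stand in for $x$ and $p$, and the unitary matrix `Phi` is a hypothetical choice.

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def commutator(A, B):
    """AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def max_abs_diff(A, B):
    """Largest modulus of the element-wise difference."""
    n = len(A)
    return max(abs(A[i][j] - B[i][j]) for i in range(n) for j in range(n))

# Hypothetical unitary transformation matrix, so that Phi^{-1} = Phi^dagger.
theta, phi = 0.7, 0.3
Phi = [[math.cos(theta) + 0j, -math.sin(theta) * cmath.exp(-1j * phi)],
       [math.sin(theta) * cmath.exp(1j * phi), math.cos(theta) + 0j]]
Phi_inv = dagger(Phi)

# Stand-ins for x and p (Hermitian, non-commuting); not the true infinite matrices.
x = [[0j, 1 + 0j], [1 + 0j, 0j]]
p = [[0j, -1j], [1j, 0j]]

Q = matmul(matmul(Phi_inv, x), Phi)   # Q = Phi^{-1} x Phi
P = matmul(matmul(Phi_inv, p), Phi)   # P = Phi^{-1} p Phi

lhs = commutator(P, Q)                                   # PQ - QP
rhs = matmul(matmul(Phi_inv, commutator(p, x)), Phi)     # Phi^{-1}(px - xp)Phi
```

Since the right-hand side of (133) is a multiple of the unit matrix, which commutes with $\Phi$, the same identity applied to the true matrices $x$ and $p$ gives (136b) directly. The new matrices `Q` and `P` also remain Hermitian because `Phi` is unitary.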

In order that the new variables should be represented by Hermitian matrices just as the original ones, the transformation matrix $\Phi$ must be unitary, i.e. satisfy the relation $\Phi^{-1} = \Phi^{\dagger}$.

The equations (136a) are thus formally quite similar to the equations (124) of § 16. They have, however, an entirely different physical meaning. While the transformation matrix $a$ in (124) has a mixed character referring to two different sets of states, the elements of the matrix $\Phi$ refer to the same set of states specified by the characteristic values of some basic quantity which serves for the definition of the matrices $x$, $p_x$, $Q$, $P$, and $H$ (this basic quantity can in particular coincide with the invariable energy $H$).

The equations (136a) must be considered as corresponding to the classical equations $p_x = \frac{\partial \Phi}{\partial x}$, $P = -\frac{\partial \Phi}{\partial Q}$ [cf. (31a), § 4] defining a contact transformation with the help of an arbitrary function $\Phi$. In the quantum theory the latter is replaced by the likewise arbitrary transformation matrix $\Phi$.

In the classical theory a canonical transformation is characterized by 
the fact that it does not alter the canonical form of the equations of 
motion. The same criterion is easily seen to apply to the canonical 
transformation (136a) of the quantum theory. 

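We have, in fact, differentiating $Q$ and $P$ with respect to the time $t$,

$$\frac{dQ}{dt} = [H, Q], \qquad \frac{dP}{dt} = [H, P],$$

which in virtue of (136) can be written in the form

$$\frac{dQ}{dt} = \frac{\partial H}{\partial P}, \qquad \frac{dP}{dt} = -\frac{\partial H}{\partial Q}$$

[cf. § 7, eqs. (43a) and (44c)].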

An equivalent form of the condition that the variables $P$ and $Q$ should be canonically conjugate (in the classical sense), i.e. that they should satisfy the canonical equations of motion, is that the Poisson bracket expression

$$(A, B) = \sum_i \left( \frac{\partial A}{\partial p_i}\frac{\partial B}{\partial x_i} - \frac{\partial A}{\partial x_i}\frac{\partial B}{\partial p_i} \right)$$

should be equal to 1 for $A = P_i$, $B = Q_i$ ($i = 1, 2, 3$) and to 0 for all the other combinations of the variables $P$, $Q$. This condition corresponds to the commutation conditions (136b) which can be written in the form $[Q_i, Q_k] = 0$, $[P_i, P_k] = 0$, $[P_i, Q_k] = \delta_{ik}$, the classical Poisson bracket being the analogue of the quantum bracket expression

$$[A, B] = \frac{2\pi i}{h}(AB - BA) \quad \text{(cf. § 8)}.$$

18. Geometrical Representation of the Transformation Theory 
The understanding of the generalized matrix theory, connected with 
the ‘principle of relativity’ in the choice of the basic quantities and 
with the transformation from one ‘basis’ to another, can be greatly 
facilitated by the use of a geometrical picture, or rather of a geometrical
language, suggested by the formal similarity between the equations of 
the transformation theory developed in the preceding sections and the 
theory of linear orthogonal transformations of ordinary analytical 
geometry. The nucleus of this analogy is that in both cases the trans- 
formation equations are linear (and homogeneous) and that the 
transformation coefficients satisfy similar orthogonality and normalizing 
relations. (The mere idea of ‘orthogonality’ is suggestive of mutually 
perpendicular axes.) 

The choice of the basic quantities in the present theory corresponds 
to the choice of the coordinate system in the geometrical theory, and 
the relativity in the choice of these basic quantities corresponds to the 
relativity in the choice of the coordinate system—or, in other words, 
to the equivalence of all the directions in space. 

It will be remembered that in analytical geometry a linear orthogonal transformation means a set of linear homogeneous equations between the coordinates $x = x_1$, $y = x_2$, $z = x_3$ of an arbitrarily chosen point with respect to one system of axes, $S$, say, and the coordinates of the same point $\xi = \xi_1$, $\eta = \xi_2$, $\zeta = \xi_3$ with respect to another system $\Sigma$, both systems being orthogonal and having the same origin. These equations can be written in the form

$$\xi_n = \sum_\nu a_{n\nu}\, x_\nu, \qquad x_\nu = \sum_n a^{-1}_{\nu n}\, \xi_n, \tag{137}$$

with

$$a_{n\nu} = a^{-1}_{\nu n} = \cos(\xi_n, x_\nu). \tag{137a}$$

The relations $a_{n\nu} = a^{-1}_{\nu n}$, which are geometrically evident, can be obtained analytically from the orthogonality condition

$$\sum_n \xi_n^2 = \sum_\nu x_\nu^2, \tag{137b}$$

which gives, in conjunction with (137),

$$\sum_n a_{n\nu'}\, a_{n\nu''} = \delta_{\nu'\nu''}, \qquad \sum_\nu a^{-1}_{\nu n'}\, a^{-1}_{\nu n''} = \delta_{n'n''}. \tag{137c}$$

On the other hand, substituting the expressions of the $\xi$'s in those of the $x$'s and vice versa, we have

$$\sum_\nu a_{n'\nu}\, a^{-1}_{\nu n''} = \delta_{n'n''}, \qquad \sum_n a^{-1}_{\nu' n}\, a_{n\nu''} = \delta_{\nu'\nu''}. \tag{137d}$$

The comparison of these equations with the preceding equations leads to the relations (137a), without, of course, the geometrical interpretation with which we started.

The transformation theory which has been developed in the preceding 
sections can be obtained from this elementary theory of linear ortho- 
gonal transformations by a twofold generalization. 

Firstly, by making the number of coordinates specifying a point 
infinite, i.e. by considering, instead of the ordinary three-dimensional 
space, a fictitious space with infinitely many dimensions. 

Secondly, by considering the coordinates of a point as complex quantities and by defining the square of its distance from the origin, not as the sum of the squares of the coordinates, but as the sum of the squares of their moduli, thus replacing the orthogonality condition (137b) by the following condition:

$$\sum_n \xi_n\, \xi_n^{*} = \sum_\nu x_\nu\, x_\nu^{*}, \tag{138}$$

the summation being extended over all the coordinates. We get in this case, instead of (137c),

$$\sum_n a^{*}_{n\nu'}\, a_{n\nu''} = \delta_{\nu'\nu''}, \qquad \sum_\nu a^{-1*}_{\nu n'}\, a^{-1}_{\nu n''} = \delta_{n'n''},$$

and, since equations (137d) are not altered,

$$a^{-1}_{\nu n} = a^{*}_{n\nu} = a^{\dagger}_{\nu n}, \tag{138a}$$

that is, $a^{-1} = a^{\dagger}$.

In the special case of real coordinates $x$, $\xi$, this 'unitary' transformation reduces to the usual orthogonal transformation (though with an unlimited number of variables), and we get $a^{\dagger} = \tilde{a}^{*} = \tilde{a}$ (transposed matrix), that is, $a^{-1} = \tilde{a}$, which is another expression of the relations (137a). Although a geometrical interpretation cannot be associated with an infinite number of complex variables $x$, $\xi$, connected with each other by a unitary transformation, yet, since the number of variables does not make any difference from the purely analytical point of view (so long as it is larger than 1), we can preserve, if not a geometrical picture, at least a geometrical language with respect to the variables $x$, $\xi$ and the transformation coefficients $a_{n\nu}$. We can accordingly regard (or rather denote) the former as the coordinates of a point in a space of infinitely many dimensions with respect to two orthogonal systems of coordinates $S$ and $\Sigma$, while the latter can still be regarded (or denoted) as the cosines of the angles between the old and the new coordinate axes. The variables $x_\nu$ and $\xi_n$ can be defined also as the projections (or components) of a certain vector $\mathbf{r}$ on these axes.
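The reason for replacing (137b) by (138) can be made concrete with two complex coordinates. The sketch below is an illustration under assumed values (the unitary matrix `a` and the vector `x` are chosen arbitrarily and are not taken from the text): the sum of squared moduli is preserved by the transformation, while the plain sum of squares is not.

```python
import cmath
import math

# Hypothetical unitary matrix a (so that a^{-1} = a^dagger), angles arbitrary.
theta, phi = 0.7, 0.3
a = [[math.cos(theta) + 0j, -math.sin(theta) * cmath.exp(-1j * phi)],
     [math.sin(theta) * cmath.exp(1j * phi), math.cos(theta) + 0j]]

# Arbitrary complex "coordinates" x_nu of a point.
x = [1 + 2j, 0.5 - 1j]

# xi_n = sum over nu of a[n][nu] * x[nu], the complex analogue of (137).
xi = [sum(a[n][v] * x[v] for v in range(2)) for n in range(2)]

# Condition (138): sums of squared moduli agree.
sum_mod_x = sum(abs(c) ** 2 for c in x)
sum_mod_xi = sum(abs(c) ** 2 for c in xi)

# Condition (137b) taken literally (plain squares) fails for complex a.
sum_sq_x = sum(c * c for c in x)
sum_sq_xi = sum(c * c for c in xi)
```

With a real orthogonal matrix in place of `a` and real coordinates, the two conditions would coincide, which corresponds to the special case of real coordinates discussed above.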


In the simplest matrix transformation problem which was considered at the beginning of § 15, the role of the coordinates $x_\nu$ and $\xi_n$ is played by the characteristic functions (or rather amplitudes) $\psi^0_{H'}$, $\varphi^0_{K'}$. This is clearly seen from the fact that they are transformed according to equations (110) and (111) which are the analogues of equations (137), and that they satisfy the orthogonality relation (113) which is exactly of the same type as (138). We can thus describe the matrix transformation theory in a very suggestive geometrical language, according to the following principles.

Each stationary state specified by a wave function $\psi^0_{H'}$ can be represented geometrically by a certain direction or axis $H'$ in a space of infinitely many dimensions, which we shall call the state-space. The states specified by the different functions $\psi^0_{H'}$ are represented by axes $H'$ which are perpendicular to each other, the complete set of states defined by the operator $H$ forming a complete orthogonal system of coordinate axes in the state-space, which we shall also denote by the letter $H$. The 'completeness' of the system means that any 'vector' in the state-space can be represented as the geometrical sum of its components along the axes of $H$.

This applies in particular to vectors drawn in the directions of another complete orthogonal system of axes $K'$, which represent geometrically the stationary states defined by the operator $K$. The transformation coefficients $a_{H'K'}$ can be regarded as the projections of a unit vector in the direction of a definite axis $K'$ on the different axes $H'$ or, loosely speaking, as the cosines of the angles between the axes $K'$ and $H'$. The latter expression requires, however, a correction, inasmuch as the coefficients $a^{-1}_{K'H'} = a^{*}_{H'K'}$ can also pretend to the same role, for they represent the projection of a unit vector in the direction of a certain axis $H'$ on the different axes $K'$. This interpretation of $a_{H'K'}$ and $a^{-1}_{K'H'}$ immediately follows from the comparison of the transformation equations $\varphi^0_{K'} = \sum_{H'} a_{H'K'}\, \psi^0_{H'}$ and $\psi^0_{H'} = \sum_{K'} a^{-1}_{K'H'}\, \varphi^0_{K'}$ with (137).

It should be remembered that the quantities $\psi^0_{H'}$ and $\varphi^0_{K'}$ appearing in these equations in the role of rectangular coordinates of a point in the state-space are themselves functions of the ordinary spatial coordinates $x$, $y$, $z$, and that, moreover, they refer to the same (arbitrarily chosen) point.

So long as this point remains unspecified, $\psi^0_{H'}$ and $\varphi^0_{K'}$ can be treated as vectors, but as soon as we specify it, putting $x = x'$, we get numbers $\psi^0_{x'H'}$ and $\varphi^0_{x'K'}$ which, as we know, both with regard to their physical meaning (as probability amplitudes) and analytical properties (as transformation coefficients), are wholly similar to the numbers $a_{H'K'}$. We can regard them accordingly as the components of the vectors $\psi^0_{H'}$ and $\varphi^0_{K'}$ along the axes of a third coordinate system $X$ in the state-space, each axis $x'$ of this system specifying a definite position $x = x'$, $y = y'$, $z = z'$ of the particle in the ordinary space. The axes of this new system $X$ must be regarded as orthogonal (i.e. mutually perpendicular) in spite of the fact that they correspond not to a discrete set of states, like the axes of the system $H$ or $K$, but to a continuum of states.

Since the functions $\psi^0_{x'H'}$ and $\varphi^0_{x'K'}$ are normalized to unity, both with respect to $x$ and to $H$ or $K$, the vectors $\psi^0_{H'}$, $\varphi^0_{K'}$ as well as $\psi^0_{x'}$, $\varphi^0_{x'}$ (the latter specifying a certain position in space irrespective of the values of the energy $H$ or $K$) can be regarded as unit vectors (i.e. having the length unity) and the numbers $\psi^0_{x'H'}$ and $\varphi^0_{x'K'}$ interpreted geometrically in the same way as the numbers $a_{H'K'}$, namely, as the cosines of the angles between the axes $x'$ (not in the ordinary space of course, but in the state-space!) on the one hand, and the axes $H'$ or $K'$ on the other.

From this point of view the transformation equations

$$\begin{aligned}
\varphi^0_{x'K'} &= \sum_{H'} \psi^0_{x'H'}\, a_{H'K'},\\
\psi^0_{x'H'} &= \sum_{K'} \varphi^0_{x'K'}\, a^{-1}_{K'H'}
\end{aligned} \tag{139}$$

acquire an extremely simple geometrical meaning: they become, namely, the generalization of the well-known formula of analytical geometry for the cosine of the angle between two directions, $x'$ and $K'$, say, expressed in terms of the cosines of the angles between these directions and a complete set of mutually perpendicular directions constituting a coordinate system $H$.

In fact, if we write $\cos(x', K')$, $\cos(x', H')$, and $\cos(H', K')$ instead of $\varphi^0_{x'K'}$, $\psi^0_{x'H'}$, and $a_{H'K'}$ respectively, the first of equations (139) assumes the familiar form

$$\cos(x', K') = \sum_{H'} \cos(x', H')\cos(H', K').$$

It becomes, however, necessary to distinguish two different cosines between the same two directions (corresponding to the projection of the first on the second or the second on the first), since $a^{-1}_{K'H'} = \cos(K', H')$ is not equal to $a_{H'K'} = \cos(H', K')$ but to its conjugate complex: $\cos(K', H') = \cos^{*}(H', K')$. (The same refers, of course, to the functions $\psi^0_{x'H'}$ and $\psi^{0*}_{x'H'}$ or $\varphi^0_{x'K'}$ and $\varphi^{0*}_{x'K'}$.)

Following Dirac, we shall often use in future the simplified notation
$(K'|H')$ and $(H'|K')$ for these two ‘cosines’ or transformation coefficients;
we shall write likewise

$$\varphi^0_{x'K'} = (x'|K'), \qquad \varphi^{0*}_{x'K'} = (K'|x'),$$

$$\psi^0_{x'H'} = (x'|H'), \qquad \psi^{0*}_{x'H'} = (H'|x'),$$

thus avoiding the unnecessary complications arising from the use of 
different letters, $a$, $\psi^0$, $\varphi^0$, etc. The unit vector (in the state-space)
defining a certain state $x'$, $H'$, or $K'$ per se, i.e. irrespective of the other
states with which it can be associated, will be denoted accordingly by
the symbols $(x'|)$, $(H'|)$, $(K'|)$ or $(|x')$, $(|H')$, $(|K')$. This notation has
the advantage of representing the same thing by the same symbol (or
two ‘conjugate’ symbols), while in our previous notation the same state
corresponding to a given position $x'$ was described by two different
symbols $\psi^0_{x'}$ or $\varphi^0_{x'}$, depending upon the ‘coordinate system’ $H$ or $K$
which we had in mind.

With the new notation the transformation equations (139) can be
written in the form

$$(x'|K') = \sum_{H'} (x'|H')(H'|K'), \qquad (x'|H') = \sum_{K'} (x'|K')(K'|H'). \tag{139a}$$
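
The composition rule (139a) amounts to an ordinary matrix product of the two arrays of transformation coefficients. The following is a minimal numerical sketch, assuming two-state systems and an arbitrarily chosen family of 2×2 unitary matrices; the helper names `matmul`, `dagger`, `unitary` and `max_abs_diff` are illustrative and do not come from the text.

```python
import cmath
import math

def matmul(a, b):
    """Ordinary matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Adjoint matrix: interchange rows and columns and conjugate."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def unitary(theta, phase):
    """A 2x2 unitary matrix (an assumed example family, for illustration)."""
    c, s = math.cos(theta), math.sin(theta)
    e = cmath.exp(1j * phase)
    return [[c, -s * e], [s * e.conjugate(), c]]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a[0])))

x_H = unitary(0.3, 0.7)      # coefficients (x'|H')
H_K = unitary(1.1, -0.4)     # coefficients (H'|K')

x_K = matmul(x_H, H_K)       # (x'|K') = sum over H' of (x'|H')(H'|K')
K_H = dagger(H_K)            # (K'|H') is the conjugate complex of (H'|K')
x_H_back = matmul(x_K, K_H)  # (x'|H') = sum over K' of (x'|K')(K'|H')

composition_error = max_abs_diff(x_H_back, x_H)
unitarity_error = max_abs_diff(matmul(dagger(x_K), x_K), [[1, 0], [0, 1]])
print(composition_error, unitarity_error)
```

Both printed numbers should be zero up to rounding: the second relation of (139a) undoes the first, and the composed coefficients again form a unitary array.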


Since the three coordinate systems $H$, $K$, and $x$ are equivalent to each
other, we could write by analogy a third relation of the same form,
namely,

$$(H'|K') = \sum_{x'} (H'|x')(x'|K'),$$

if $x'$ were discretely variable, like $H'$ and $K'$. Since, however, $x'$ is
continuously variable, we must replace the sum by an integral over
$x'$, which gives

$$(H'|K') = \int (H'|x')(x'|K')\,dx', \tag{139b}$$

or, in the previous notation,

$$a_{H'K'} = \int \psi^{0*}_{H'}(x')\,\varphi^0_{K'}(x')\,dx',$$

which is nothing else but the formula (110a) obtained at the beginning
of § 15, and again in the way just shown (but without the associated
geometrical interpretation) somewhat later.

The preceding equations (139a) and (139b) hold, of course, for any 
three sets of states which may be specified by three basic ‘trios’. It 
should be remembered that, from the physical point of view, they 
express the addition and multiplication law for the probability amplitudes.
The geometrical interpretation of the probability amplitudes
$(Q'|C')$ as the cosines between the directions $Q = Q'$ and $C = C'$ in
the state-space is in perfect harmony with the initial interpretation
of the orthogonality between two functions representing two different 
states as the expression of the alternative character of these states. All 
those states which are represented by mutually perpendicular directions 
in the state-space are alternative or mutually exclusive—in the sense 
that the probability of finding the particle in one of them when it is 
known to be in another is equal to zero. All such states may always 
be referred to the same set. 

Having elucidated the geometrical meaning of the probability ampli- 
tudes—or transformation matrices—we shall now turn to the geometri- 
cal interpretation of the ordinary matrices, which represent physical 
quantities from one or the other point of view. This interpretation is 
again determined by the transformation equations (121) which show 
that Hermitian matrices can be considered as a generalization of the 
so-called tensors, or more exactly symmetrical tensors, of the elementary 
three-dimensional analytical geometry. 

A tensor can be defined as a composite quantity with a number of 
components, each of which refers to two axes of the same system of 
coordinates, and behaves with respect to a transformation of the coordinate
system in the same way as the product of the components of
two vectors along the corresponding axes.

Let us consider again the two coordinate systems $S$ and $\Sigma$ and denote
the components of the same vector, $f$, say, along the axes of $S$ and $\Sigma$
by $f_m$ and $f_\mu$ respectively. If $g$ is some other vector, and if we form
the products of all the components of $f$ with all the components of $g$,
referred to the same system, we shall obtain a set of 9 quantities

$$T_{mn} = f_m g_n \quad\text{or}\quad T_{\mu\nu} = f_\mu g_\nu, \tag{140}$$

which can be considered as the components of the same tensor $T$
referred to, or represented from the point of view of, the coordinate
system $S$ or $\Sigma$. Taking into account the transformation equations

$$f_\mu = \sum_m a_{m\mu} f_m, \qquad g_\nu = \sum_n a_{n\nu} g_n,$$

with the coefficients $a_{m\mu} = a^{-1}_{\mu m} = \cos(x_m, \xi_\mu)$, we get

$$T_{\mu\nu} = \sum_m \sum_n a_{m\mu} a_{n\nu} T_{mn} = \sum_m \sum_n a^{-1}_{\mu m} T_{mn}\, a_{n\nu} \tag{140a}$$

and

$$T_{mn} = \sum_\mu \sum_\nu a_{m\mu} a_{n\nu} T_{\mu\nu} = \sum_\mu \sum_\nu a_{m\mu} T_{\mu\nu}\, a^{-1}_{\nu n}. \tag{140b}$$

These transformation equations can serve to define a tensor $T$ in the
general case, when its components cannot be put in the simple form
(140). These equations can obviously be written in the following matrix
form:

$$T_\Sigma = a^{-1} T_S\, a, \qquad T_S = a\, T_\Sigma\, a^{-1},$$

which makes it evident that a matrix $F_C$, representing some quantity
$F$ from the point of view of some other basic quantity $C$, can be interpreted
geometrically as a certain tensor $F$ in the state-space referred
to a system of coordinates whose axes represent the states specified by
the characteristic values of $C$.

The matrices $F_C$ representing real quantities are Hermitian, i.e.
satisfy the relation

$$F_{C'C''} = F^*_{C''C'},$$

which can be considered as the generalization of the condition

$$T_{mn} = T_{nm}, \qquad T_{\mu\nu} = T_{\nu\mu},$$

for the symmetrical tensors of ordinary analytical geometry.

Now such tensors admit of a very simple and suggestive geometrical
illustration, namely, that of a central quadric (ellipsoid, hyperboloid),
defined by the equation

$$\sum_m \sum_n T_{mn}\, x_m x_n = 1 \tag{140c}$$

in the coordinate system $S$, or

$$\sum_\mu \sum_\nu T_{\mu\nu}\, \xi_\mu \xi_\nu = 1 \tag{140d}$$

in the coordinate system $\Sigma$.

The fact that these two equations represent the same surface, i.e. that
the coefficients $T_{mn}$ and $T_{\mu\nu}$ are transformed into each other according
to equations (140a) and (140b), can be proved by substituting in
(140d) the expressions $\xi_\mu = \sum_m a_{m\mu} x_m$, $\xi_\nu = \sum_n a_{n\nu} x_n$, which gives

$$\sum_\mu \sum_\nu \sum_m \sum_n T_{\mu\nu}\, a_{m\mu} a_{n\nu}\, x_m x_n = 1,$$

or, changing the order of summation with regard to the Greek and
Latin indices,

$$\sum_m \sum_n x_m x_n \Bigl(\sum_\mu \sum_\nu T_{\mu\nu}\, a_{m\mu} a_{n\nu}\Bigr) = 1,$$


which, in view of (140a), coincides with (140c). 
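
The invariance of the quadric under a change of axes can be checked numerically. The sketch below assumes, purely for illustration, that the orthogonal transformation is a rotation about the third axis and that the tensor and the test point have arbitrary sample values; the function names are hypothetical.

```python
import math

def rotation_z(angle):
    """Direction cosines a[m][mu] between the old axes x_m and the new axes xi_mu."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_tensor(T, a):
    """T_{mu nu} = sum over m, n of a[m][mu] * a[n][nu] * T[m][n]."""
    n = len(T)
    return [[sum(a[m][mu] * a[k][nu] * T[m][k]
                 for m in range(n) for k in range(n))
             for nu in range(n)] for mu in range(n)]

def transform_vector(x, a):
    """xi_mu = sum over m of a[m][mu] * x[m]."""
    return [sum(a[m][mu] * x[m] for m in range(len(x)))
            for mu in range(len(x))]

def quadratic_form(T, x):
    n = len(x)
    return sum(T[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

a = rotation_z(0.6)
T_S = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 4.0]]  # sample tensor in S
x = [0.2, -0.7, 0.4]                                        # sample point in S

T_Sigma = transform_tensor(T_S, a)   # same tensor referred to Sigma
xi = transform_vector(x, a)          # same point referred to Sigma

value_S = quadratic_form(T_S, x)
value_Sigma = quadratic_form(T_Sigma, xi)
print(value_S, value_Sigma)
```

The two printed values agree to rounding error, so a point lying on the surface in one coordinate system lies on it in the other.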

The components of a symmetrical tensor referred to a system of 
coordinates can thus be interpreted as the coefficients in the equation 
of a certain central quadric referred to the same coordinate system; 
this makes it possible to visualize a symmetrical tensor, without any 
reference to a system of coordinates, as the quadric surface which it defines. 

It should be mentioned that a quadric surface can be defined, according
to (140c), by a non-symmetrical tensor just as well as by a symmetrical
one. But it will actually contain the sum of the components
$T_{mn} + T_{nm}$, referring to the coordinates $x_m$ and $x_n$, as the coefficient of
their product $x_m x_n$. The asymmetry of $T$, if any, will therefore not be
manifested in the shape of the surface, or, in other words, the latter
will define only the symmetrical part of $T$. Thus a tensor can be completely
specified by a quadric surface only when it is symmetrical.
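
A short sketch of this remark, with an arbitrary non-symmetrical sample tensor (the values and function names are assumptions made for illustration): the quadratic form computed from the full tensor coincides with the one computed from its symmetrical part alone.

```python
def quadratic_form(T, x):
    n = len(x)
    return sum(T[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def symmetric_part(T):
    """Components (T_mn + T_nm) / 2."""
    n = len(T)
    return [[0.5 * (T[i][j] + T[j][i]) for j in range(n)] for i in range(n)]

T = [[2.0, 1.5, 0.0], [-0.5, 1.0, 0.7], [0.4, -0.1, 3.0]]  # non-symmetrical
x = [0.3, -1.2, 0.8]

full_value = quadratic_form(T, x)
symmetric_value = quadratic_form(symmetric_part(T), x)
print(full_value, symmetric_value)
```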

Every central surface of the second order has three mutually perpendicular
axes of symmetry, which can be defined by the condition
that, referred to a system of coordinates $\Sigma$ whose axes coincide with
its symmetry axes, the equation of the quadric reduces to the ‘canonical’
form

$$\sum_\mu T_\mu\, \xi_\mu^2 = 1,$$

not containing products of different coordinates.

This can be expressed by saying that the matrix $T_\Sigma$ considered from
this point of view is diagonal. The possibility of reducing the equation
of a central quadric to the canonical form, i.e. the existence of symmetry
axes, is proved by a well-known method which at the same time leads
to the actual determination of the cosines between these axes and the
original axes $x_m$, i.e. of the coefficients of the orthogonal transformation
$S \to \Sigma$, and of the diagonal elements of the transformed matrix, or, in
other words, of the characteristic values of the tensor $T$: $T'$, $T''$, $T'''$.

This method consists in defining the vertices of the quadric, i.e. the
end-points of the symmetry axes, by either one of the following conditions:

(1) The normals to the surface at the vertices coincide in direction
with the radii vectores from the centre. This condition leads to the
equations

$$\frac{\partial F}{\partial x_m}\ \text{proportional to}\ x_m,$$

where $F$ denotes the left side of equation (140c), or, if the proportionality
factor (apart from the factor 2 arising from the differentiation) is denoted by $T'$:

$$\sum_n T_{mn}\, x_n = T' x_m. \tag{141}$$


So long as we are dealing with ordinary three-dimensional space, this
is a set of three linear equations which are compatible with each other
if their determinant vanishes. The latter condition gives a cubic equation
for $T'$, and to the three roots of it there correspond three sets of
$x_m$ values, $x_{m\mu}$, say, which define three mutually perpendicular vectors,
and reduce to the cosines of the angles between the old axes and the
symmetry axes if normalized to unity. The three values of $T'$ turn out
to be the three non-vanishing diagonal elements of the transformed
matrix or tensor $T_\Sigma$.

(2) The distances of the vertices from the centre, or their squares
$r^2 = \sum_m x_m^2$, have the largest or smallest possible values consistent with
the equation $F = \sum_m \sum_n T_{mn}\, x_m x_n = 1$.

This gives, with the help of Lagrange’s method of undetermined
multipliers, a system of equations derived from

$$\delta r^2 + \lambda\,\delta F = 0 \tag{141a}$$

by equating to zero the coefficients of the variations of the separate
coordinates with a properly chosen value of the coefficient $\lambda$. Putting
$\lambda = -1/T'$, we again get equations (141).

It should be mentioned that the variational equation (141a) can be
interpreted as the condition that $F$ should have a maximum, minimum,
or stationary value while $r^2$ is kept constant, for instance equal to unity.
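
Conditions (1) and (2) can be compared on the simplest case, a symmetrical tensor in two dimensions, where the characteristic values are available in closed form. The sketch below is an illustration under that assumption (a plane section instead of the three-dimensional quadric); the sample components and all function names are hypothetical.

```python
import math

def principal_axes_2d(A, B, C):
    """Characteristic values and unit symmetry axes of the tensor [[A, B], [B, C]]."""
    mean = 0.5 * (A + C)
    radius = math.hypot(0.5 * (A - C), B)
    theta = 0.5 * math.atan2(2.0 * B, A - C)   # direction in which F is largest
    axis_max = (math.cos(theta), math.sin(theta))
    axis_min = (-math.sin(theta), math.cos(theta))
    return (mean + radius, axis_max), (mean - radius, axis_min)

def apply(T, x):
    return [T[0][0] * x[0] + T[0][1] * x[1], T[1][0] * x[0] + T[1][1] * x[1]]

A, B, C = 3.0, 1.0, 2.0
T = [[A, B], [B, C]]
(T_max, x_max), (T_min, x_min) = principal_axes_2d(A, B, C)

# Condition (1): residual of equations (141), sum_n T_mn x_n - T' x_m.
residual = max(abs(apply(T, x)[m] - Tp * x[m])
               for Tp, x in ((T_max, x_max), (T_min, x_min))
               for m in range(2))

# Condition (2): extreme values of F on the unit circle r^2 = 1, by direct scan.
angles = [2.0 * math.pi * k / 3600 for k in range(3600)]
F_values = [A * math.cos(t) ** 2 + 2.0 * B * math.sin(t) * math.cos(t)
            + C * math.sin(t) ** 2 for t in angles]
scan_max, scan_min = max(F_values), min(F_values)

# Lengths of the semi-axes of the curve F = 1.
semi_axes = (1.0 / math.sqrt(T_max), 1.0 / math.sqrt(T_min))
print(T_max, T_min, residual, scan_max, scan_min, semi_axes)
```

The extreme values found by the scan coincide, within the resolution of the scan, with the characteristic values obtained from (141), and the semi-axes are the reciprocal square roots of those values.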

(3) Finally we could find the symmetry axes of $T$ by defining the
transformation coefficients $a_{m\mu}$ in equations (140a) in such a way that
the three transformed non-diagonal components of $T$ vanish, or, in
other words, that the transformed matrix $T_\Sigma$ be diagonal. This again,
as can easily be shown, leads to equations (141) or, more exactly, to

$$\sum_n T_{mn}\, a_{n\mu} = T_\mu\, a_{m\mu}.$$


These equations, as well as equations (141), are obviously of the same
type as equations (122b) or (123) of § 16 defining the transformation
of the matrix $K_H$ to the diagonal matrix $K_K$. They only differ in the
number of dimensions, this being equal to three in the case of ordinary 
space and to infinity in the case of the state-space to which the latter 
equations refer. Another difference between them and the correspond- 
ing elementary equations is that the vectors and tensors with which 
we have to do in the case of the state-space are complex, the symmetry 
condition for the ordinary tensors being replaced by the Hermitian 
condition for the tensors in the state-space. 

With this amendment, which from the purely analytical point of view 
is merely a trivial generalization of the ideas and relations of ordinary 
analytical geometry, we can apply the tensor idea and the idea of a 
quadric central surface in the state-space for the representation of 
physical quantities which have hitherto been represented by Hermitian 
matrices. The idea of a tensor, together with the ‘principle of relativity’
in the choice of the coordinate system, is actually equivalent to the
idea of a matrix in conjunction with the principle of relativity of the
basic quantities which determine the coordinate system.

The additional feature of the geometrical representation derived by
generalizing the ordinary geometrical theory is the possibility of thinking
of a quantity $F$ as pictured, as it were, by a central quadric surface
in the state-space, the axes of symmetry of this surface representing
the different states specified by the characteristic values of $F$, and these
characteristic values being inversely proportional to the squares of the
lengths of these axes drawn from the centre to the vertices (without
being prolonged to infinity). The latter relation follows from the fact
that in the canonical form of the equation of the quadric, $\sum_\mu T_\mu\, \xi_\mu^2 = 1$,
the coefficients $T_\mu$, which are obviously the reciprocals of the squares
of the lengths of the axes (with positive or negative sign), represent at
the same time the characteristic values $T'$ (or $T'$, $T''$, $T'''$) of the
tensor $T$.

The equation of a quadric surface representing in the state-space
a certain quantity $F$ referred to the symmetry axes of the quadric
surface which represents some other quantity, $C$, say, can be written
in the form

$$\sum_{C'} \sum_{C''} F^0_{C'C''}\, a^*_{C'} a_{C''} = \text{const.}, \tag{142}$$

if the values of $C$ form a discrete set, or in the form

$$\iint F^0_{C'C''}\, a^*_{C'} a_{C''}\, dC'\,dC'' = \text{const.}, \tag{142a}$$

if they vary in a continuous manner, while the expression

$$E = \sum_{C'} a^*_{C'} a_{C'} \tag{142b}$$

or

$$E = \int a^*_{C'} a_{C'}\, dC' \tag{142c}$$

can be interpreted as the square of the distance from the common
centre of the two surfaces to some point with the coordinates $a_{C'}$.

The characteristic values of $F$ and the states specified by them can
be found by transforming the quadric (142) to the canonical form, i.e.
to the symmetry axes of $F$. This problem, as we know already, is
solved by the transformation equations

$$\sum_{C''} F^0_{C'C''}\, a_{C''} = F' a_{C'} \quad\text{or}\quad \int F^0_{C'C''}\, a_{C''}\, dC'' = F' a_{C'}, \tag{143}$$

the resulting normalized $a_{C''} = a_{C''F'} = (C''|F')$ being the cosines of the
angles between the symmetry axes of $C$ and those of $F$, or, from the
physical point of view, the probability amplitudes for getting a certain value of
$C$ when that of $F$ is supposed to be known.

An important relationship between the two quantities is expressed 
by the coincidence of the symmetry axes of the associated surfaces. 
This means the coincidence of the states specified by the corresponding 
characteristic values of F and C and is equivalent to the condition 
that F and C, defined as matrices or operators from any common point 
of view (Q say), commute with each other. To prove this we shall first 
put $Q = C$. The matrices $F_C$ and $C_C$, being both diagonal, must commute
with each other, since their product is also a diagonal matrix,
independent of the order of the factors:

$$(FC)_{C'C''} = F_{C'C'}\, C_{C'C'}\, \delta_{C'C''} = (CF)_{C'C''}.$$

Now when $Q \neq C$ one can always define a (unitary) transformation
matrix $b$ which will transform $C$ into $Q$ according to the equation
$Q = bCb^{-1}$. According to the invariance property with regard to
canonical transformations of this form expressed by equation (124a),
we must have

$$F_Q C_Q - C_Q F_Q = b\,(F_C C_C - C_C F_C)\,b^{-1} = 0.$$
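
The argument can be illustrated with 2×2 matrices. In the sketch below the diagonal entries, the unitary matrix `b` and the non-commuting counter-example `G_C` are arbitrary assumptions chosen for the illustration, and the helper names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def unitary(theta, phase):
    """A 2x2 unitary matrix (assumed example family)."""
    c, s = math.cos(theta), math.sin(theta)
    e = cmath.exp(1j * phase)
    return [[c, -s * e], [s * e.conjugate(), c]]

def commutator_size(a, b):
    """Largest absolute element of ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return max(abs(ab[i][j] - ba[i][j])
               for i in range(len(a)) for j in range(len(a)))

# F and C referred to the symmetry axes of C: both diagonal.
F_C = [[1.0, 0.0], [0.0, 2.0]]
C_C = [[5.0, 0.0], [0.0, 7.0]]

# The same two quantities from the point of view of another quantity Q.
b = unitary(0.8, 0.3)
b_inv = dagger(b)            # for a unitary matrix the inverse is the adjoint
F_Q = matmul(matmul(b, F_C), b_inv)
C_Q = matmul(matmul(b, C_C), b_inv)
commuting = commutator_size(F_Q, C_Q)

# A quantity whose axes do not coincide with those of C.
G_C = [[1.0, 0.5], [0.5, 2.0]]
non_commuting = commutator_size(G_C, C_C)
print(commuting, non_commuting)
```

The first number vanishes to rounding error although neither `F_Q` nor `C_Q` is diagonal; the second does not vanish, since `G_C` is not diagonal along the axes of `C`.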

The transformation equations from $C$ to $F$ in the general case when
these quantities do not commute can be derived from a variational
principle of the same type as that which serves to determine the vertices
of a quadric in ordinary analytical geometry. We can put, namely,
$\delta E = 0$, subject to the condition (142) or (142a), giving

$$\delta F - F'\,\delta E = 0, \tag{143a}$$

where $F$ denotes the left-hand side of (142) or (142a) and $E$ the expression
(142b) or (142c) respectively, while $F'$ is an undetermined multiplier.
This equation can also be interpreted as expressing the fact that
$\delta F = 0$ subject to the condition that $E = \text{const.}$ ($= 1$, say). The
variations of $a_{C'}$ and $a^*_{C'}$ must be considered as independent of each
other and their coefficients in (143a) set equal to zero, which leads to
the transformation equations (143) and their conjugate complex (i.e. the
equations of the reciprocal transformation).

The ‘conditioned’ variational equations $\delta E = 0$ with $F = \text{const.}$, or
$\delta F = 0$ with $E = \text{const.}$, can be replaced by the ‘unconditioned’ variational
equation

$$\delta(F/E) = 0, \tag{143b}$$

which automatically provides for the normalization of the functions
$a_{C'}$ so far as the value of $F$ is concerned. If, indeed, the $a_{C'}$ are not
normalized, then the functions $a_{C'}/\sqrt{E}$ can be considered as their normalized
values and $F/E$ as the value of $F$ subject to the appropriate
normalization conditions ($\sum_{C'} a^*_{C'} a_{C'} = 1$ or $\int a^*_{C'} a_{C'}\, dC' = 1$).


It is obvious from the comparison of (143b) with (143) that the
stationary values of $F/E$ are just equal to the characteristic values $F'$,
a fact which can be ascertained directly with the help of the transformation
equations. Taking, for instance, $F = \sum_{C'} \sum_{C''} a^*_{C'} F_{C'C''}\, a_{C''}$,
then, since $\sum_{C''} F_{C'C''}\, a_{C''} = F' a_{C'}$, we get $F/E = F' \sum_{C'} a^*_{C'} a_{C'} / E = F'$.
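
A small numerical sketch of this property, assuming a 2×2 Hermitian matrix chosen so that its characteristic values (1 and 4) and an unnormalized characteristic vector can be written down exactly; the matrix, the vector and the function name `quotient` are illustrative assumptions.

```python
def quotient(F, a):
    """F/E with F = sum a*_C' F_C'C'' a_C'' and E = sum a*_C' a_C'."""
    n = len(a)
    numerator = sum(a[i].conjugate() * F[i][j] * a[j]
                    for i in range(n) for j in range(n))
    denominator = sum(abs(z) ** 2 for z in a)
    return (numerator / denominator).real

F = [[2.0, 1 - 1j], [1 + 1j, 3.0]]   # Hermitian; characteristic values 1 and 4
a = [1 - 1j, 2 + 0j]                 # unnormalized vector belonging to F' = 4

value = quotient(F, a)
rescaled = quotient(F, [3j * z for z in a])   # normalization does not matter

eps = 1e-4
perturbed = quotient(F, [a[0] + eps, a[1]])   # small displacement of a
deviation = abs(perturbed - value)            # of the second order in eps
print(value, rescaled, deviation)
```

The quotient equals the characteristic value whatever the normalization of the vector, and a displacement of order `eps` changes it only by a quantity of order `eps` squared, which is what a stationary value means.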


The variational principle which we have just considered is a generalization
of the variational principle for the energy, which was considered
in the preceding chapter under the form $\delta H = 0$, with $H = \int \psi^{0*} H \psi^0\, dV$
and $E = \int \psi^{0*} \psi^0\, dV = 1$. It reduces to the preceding form if $\psi^0$ is
replaced by the sum $\sum_{C'} a_{C'} \psi^0_{C'}$, $\psi^0_{C'}$ being the characteristic functions
of the operator $C_{(x)}$, which may be supposed to represent a Hamiltonian
slightly different from that represented by the operator $H$ or, more
exactly, $H_{(x)}$.

This leads to a problem of the perturbation theory, which, from the 
geometrical point of view, outlined in this section, can be regarded as 
the problem of finding the symmetry axes of the quadric surface H, 
whose equation is referred to the symmetry axes of a slightly different 
quadric C. 

More generally we can say that from this geometrical point of view
the quantum mechanics can be regarded as the analytical geometry of 
central quadric surfaces in the state-space; the symmetry axes of each 
such surface specify, by their length, the characteristic values of the 
physical quantity represented by this surface, and, by their direction, 
the associated states; while the cosines between the symmetry axes of 
two different surfaces represent the probability amplitudes for a certain 
value of one quantity (or set of three quantities) when the other 
quantity (or set of three quantities) is known to have a given value. 

In conclusion a few remarks should be added on the question of
notations. Dirac and following him many other authors denote the
elements of a matrix $F_C$ by the symbol $(C'|F|C'')$ which is equivalent
to the symbol $F^0_{C'C''}$ used in this chapter, and which has the advantage
of being closely connected with the symbol $(x'|C')$ for the probability
amplitudes $\psi^0_{x'C'}$. Using Dirac’s notation, we can write the transformation
equations connecting the matrices $F_H$ and $F_K$ in the following
form:

$$(K'|F|K'') = \sum_{H'} \sum_{H''} (K'|H')(H'|F|H'')(H''|K''),$$

if the spectrum of $H$ is discrete, or

$$(K'|F|K'') = \iint (K'|H')\,dH'\,(H'|F|H'')\,dH''\,(H''|K''),$$

if it is continuous.

The index ° in our notation serves to indicate that the time, which 
is supposed not to appear in the equations of this chapter, is ignored. 
We shall take it into account in a later section. 

Another remark refers to a type of vector notation applied by Dirac 
to vectors and tensors in the state-space and quite similar to that used 
in the ordinary three-dimensional vector and tensor analysis. 

A state (in the quantum-mechanical sense) is specified by a vector,
$\psi$, say, of unit length and of a definite direction in the state-space. The
components of this vector with respect to a system of coordinates $C'$
may be denoted by $\psi^0_{C'}$. The same state can, however, be specified by
the conjugate complex of $\psi$, which is a vector $\psi^*$ with the components
$\psi^{0*}_{C'}$.

The sum $\sum_{C'} \psi^{0*}_{C'} \psi^0_{C'}$ or the integral $\int \psi^{0*}_{C'} \psi^0_{C'}\, dC'$, which is the measure
of the square of the common length of the vectors $\psi^*$ and $\psi$, will be
denoted as their ‘scalar product’ $\psi^*\psi$. In a similar way the scalar product
of two different vectors $\psi_1$ and $\psi_2$ referring to two different states
will be denoted by $\psi_1^*\psi_2$ or $\psi_2^*\psi_1$, which means, in the coordinate representation,
$\sum_{C'} \psi^{0*}_{1C'} \psi^0_{2C'}$ or $\sum_{C'} \psi^{0*}_{2C'} \psi^0_{1C'}$ (the sums being again replaced by
integrals in the case of a continuous $C$-spectrum).

These expressions (which are conjugate complex with regard to each
other) can be regarded, from the physical point of view, as the probability
amplitudes for the simultaneous occurrence of the two states
(a measure of the ‘mutual compatibility’ of the latter). If these states
are alternative (mutually exclusive), the vectors $\psi_1$ and $\psi_2$ are mutually
orthogonal, which means that $\psi_1^*\psi_2 = \psi_2^*\psi_1 = 0$.

Further, let $F$ denote a tensor representing not a state, such as $\psi$,
but a certain physical quantity (an ‘observable’ or ‘dynamical variable’
according to Dirac), with the components $F_{C'C''}$ along the axes (= states)
of $C$ (we are dropping for convenience the superscript zero). The sum
$\sum_{C''} F_{C'C''} \psi_{C''}$ (or integral $\int F_{C'C''} \psi_{C''}\, dC''$) can be considered as the $C'$-component
of another vector, $\varphi$, say, specifying some state, in general
different from $\psi$. This vector will be called the product of the tensor $F$
and the vector $\psi$ and denoted by $F\psi$ [so that $(F\psi)_{C'} = \sum_{C''} F_{C'C''} \psi_{C''}$].

The conjugate complex of $\varphi$ can be defined in a similar way as the
product of $F$ and $\psi^*$ taken in the inverse order, i.e. by the formula
$\varphi^* = \psi^* F$, which means, in the coordinate representation,

$$\varphi^*_{C'} = (\psi^* F)_{C'} = \sum_{C''} \psi^*_{C''} F_{C''C'} \quad \Bigl(\text{or } \int \psi^*_{C''} F_{C''C'}\, dC''\Bigr).$$

This gives

$$\varphi^*\psi = \sum_{C'} \varphi^*_{C'} \psi_{C'} = \sum_{C'} (\psi^* F)_{C'}\, \psi_{C'} = \sum_{C'} \sum_{C''} \psi^*_{C''} F_{C''C'} \psi_{C'} \quad \Bigl(\text{or } \iint \psi^*_{C''} F_{C''C'} \psi_{C'}\, dC'\,dC''\Bigr),$$

which will be denoted simply as $\psi^* F \psi$.

We get further (taking for the sake of simplicity the case of a discrete
$C$-spectrum)

$$\varphi^*\varphi = \sum_{C'} \varphi^*_{C'} \varphi_{C'} = \sum_{C'} \sum_{C''} \sum_{C'''} \psi^*_{C''} F_{C''C'} F_{C'C'''} \psi_{C'''},$$

or, since $\sum_{C'} F_{C''C'} F_{C'C'''} = (F^2)_{C''C'''}$, we get

$$\varphi^*\varphi = \psi^* F^2 \psi.$$

The preceding formula is the simplest example of a ‘tensor product’.
The product of two tensors $F$ and $G$ taken in the order stated is defined
as a tensor with the components

$$(FG)_{C'C''} = \sum_{C'''} F_{C'C'''}\, G_{C'''C''} \quad\text{or}\quad \int F_{C'C'''}\, G_{C'''C''}\, dC'''.$$

This definition of tensor multiplication is identical with the definition
of matrix multiplication if $F$ and $G$ are considered not as tensors but
as matrices.

The matrix representation can also be applied to vectors such as $\psi$
if we generalize the conception of a matrix by admitting matrices which
consist not of a square array of numbers (elements, components) but
of a rectangular array (with a different number of rows and columns)
and, in particular, of a linear array with one row or one column only.
If we wish to preserve the general multiplication law, i.e. that the
product of two matrices shall be a matrix obtained by combining the
rows of the first factor with the columns of the second, we must represent
the vector $\psi$ and its conjugate complex $\psi^*$ by linear matrices of
different kinds, the one, considered as the first factor, consisting of one
row only and the second of one column only.

Taking the components of $\psi$ and $\psi^*$ along the $C$-axes as the elements
of the matrices $\psi_C$ and $\psi^*_C$, we shall put accordingly

$$\psi^*_C = (\psi^*_{C'}\ \ \psi^*_{C''}\ \ \psi^*_{C'''}\ \dots)$$

and

$$\psi_C = \begin{pmatrix} \psi_{C'} \\ \psi_{C''} \\ \psi_{C'''} \\ \vdots \end{pmatrix},$$

which means that in multiplying two vectors or a vector and a tensor
we must always start with the conjugate complex ($\psi^*$, $\varphi^*$) and finish
with the original ones. From the matrix point of view we should write
$\psi^\dagger$ (adjoint matrix) instead of $\psi^*$, for the matrix $\psi^*$ defined above is
obtained from the matrix $\psi$ not only by taking the conjugate complex
of its elements, but also by an interchange of the rows and columns (cf.
§ 16). With this convention the scalar product of two vector-matrices
$\psi$ and $\varphi$ can be written in the form $\psi^\dagger\varphi$ or $\varphi^\dagger\psi$, while the symbols
$\varphi\psi^\dagger$ or $\psi\varphi^\dagger$ have no meaning. Taking the components of $\psi^\dagger\varphi$ in the
usual way, we get

$$(\psi^\dagger\varphi)_{mn} = \sum_r \psi^\dagger_{mr}\, \varphi_{rn},$$

which is equal to zero unless $m = 1$ (first row of $\psi^\dagger$) and $n = 1$ (first
column of $\varphi$).

The product of a vector $\psi$ and a tensor $F$ must be represented
accordingly in either of the two forms $F\psi$ or $\psi^\dagger F$, the former being
a matrix of the same form as $\psi$ and the latter a matrix of the same
form as $\psi^\dagger$. The two matrices are, of course, adjoint with regard to
each other, so that we can write

$$(\psi^\dagger F)^\dagger = F\psi,$$

which is quite natural since $F^\dagger = F$ (so long as $F$ is a Hermitian
matrix).
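
The row-and-column conventions above can be tried out directly. The sketch below assumes a two-state system with an arbitrarily chosen Hermitian matrix and unit vector; vectors are stored as one-column matrices and their adjoints as one-row matrices, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Combine the rows of the first factor with the columns of the second."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Adjoint: conjugate the elements and interchange rows and columns."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

F = [[2.0, 1 - 1j], [1 + 1j, 3.0]]   # Hermitian tensor (sample values)
psi = [[0.6 + 0j], [0.8j]]           # column matrix of unit length
psi_dag = dagger(psi)                # row matrix

phi = matmul(F, psi)                 # F psi, a column like psi
row = matmul(psi_dag, F)             # psi-dagger F, a row like psi-dagger

# (psi-dagger F)-dagger = F psi
adjoint_error = max(abs(dagger(row)[i][0] - phi[i][0]) for i in range(2))

norm = matmul(psi_dag, psi)[0][0]                          # scalar product
lhs = matmul(dagger(phi), phi)[0][0]                       # phi-dagger phi
rhs = matmul(matmul(psi_dag, matmul(F, F)), psi)[0][0]     # psi-dagger F^2 psi
print(adjoint_error, norm, lhs, rhs)
```

Each scalar product comes out as a matrix with a single element, which is why the sketch reads off the element `[0][0]`.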

It should be mentioned finally that the linear matrices with the
elements $\psi_{C'}$ can be replaced by ‘square’ matrices with the elements
$\psi_{C'Q'}$ representing a set of vectors, which correspond to different
values of $Q'$, or, in other words, the cosines between the directions $Q'$
and $C'$. Such matrices are not Hermitian but unitary, i.e. satisfy the
relation $\psi^\dagger = \psi^{-1}$ ($\psi^\dagger_{Q'C'} = \psi^*_{C'Q'}$). The preceding formulae, relating to
the products of the type $\psi^\dagger\varphi$ or $F\psi$, etc., remain valid with this interpretation
of the $\psi$, i.e. not as vectors specifying states, but as cosines
between two sets of axes specifying two sets of states and measuring
the probability amplitudes of their coexistence. The transformation
equations $\varphi^0_{x'K'} = \sum_{H'} a_{H'K'}\, \psi^0_{x'H'}$ can be written accordingly in the form
$\varphi_{x'K'} = \sum_{H'} \psi_{x'H'}\, a_{H'K'}$, or $\varphi = \psi a$ (the order of the factors on the right
side being opposite to that which corresponds to the product of $\psi$
considered as a vector with a matrix representing a tensor).


V 
PERTURBATION THEORY 


19. Perturbation Theory not involving the Time (Method of Stationary States)

The exact determination of the wave functions $\psi^0_{x'H'} = (x'|H')$ which
specify the motion of a particle in a complicated field of force is usually 
impossible on account of analytical difficulties. But even if these diffi- 
culties could be overcome, it would hardly be possible to use the results, 
and especially to visualize them, on account of their complicated 
character. Thus both for mathematical and physical reasons it is 
desirable, in the case of a complicated field of force, to use an approximative
method of determining the functions $\psi^0$, starting with an exact
determination of the latter for the motion in a simplified field of force, 
and introducing corrections to represent the effect of the ‘perturbing 
forces’, i.e. those forces which have been left out of account at the 
beginning. 

The energy operator corresponding to the ‘unperturbed’, i.e. simplified,
motion will be denoted by $H$ ($= H_{(x)}$) and its characteristic functions
by $\psi^0_{H'}$ ($= \psi^0_{x'H'}$). The energy operator corresponding to the actual
or ‘perturbed’ motion will be denoted by $K$ ($= K_{(x)}$) and its characteristic
functions by $\varphi^0_{K'}$ ($= \varphi^0_{x'K'}$).

The difference $K - H = S$ will thus represent the additional or ‘perturbation’
energy; it is usually defined as the potential energy of the
perturbing forces.

This perturbation energy must, of course, be regarded as ‘small’. 
The exact meaning of this condition will become apparent as we develop 
the problem by the method of the perturbation theory. 

As already mentioned, the perturbation theory (so far as H and K 
do not involve the time) amounts to a transformation of all physical 
quantities, considered as matrices, from the point of view of H to the 
point of view of K, which is supposed to be but slightly different from
H, so that the actual calculations can be carried out by means of the 
method of successive approximations. 

The principle of this method consists in regarding all quantities involving
$S$, for instance the matrix elements $S^0_{H'H''}$, as small quantities
of the first order and splitting up the exact equations into a chain of
approximate equations containing small quantities of the same order.

We shall first assume that $H$ has a discrete spectrum and that the
unperturbed motion is not degenerate, the characteristic values of $H$
being thus sufficient for the complete specification of the corresponding
states.

The fundamental part of our problem will consist in the transformation
of the matrix $K_H$ to the diagonal form $K_K$ and in the determination
of the transformation matrix $a$, according to the general equation

$$K_K = a^\dagger K_H\, a \qquad (a^\dagger = a^{-1}), \tag{144}$$

or

$$K_H\, a = a\, K_K, \tag{144a}$$

that is [cf. (123), § 16],

$$\sum_{H'''} K^0_{H'H'''}\, a_{H'''K''} = K''\, a_{H'K''}. \tag{144b}$$


We must, first of all, fix the ‘zero approximation’ which corresponds
to $S = 0$, i.e. to the actual coincidence of $K$ and $H$. Assuming the
identical states to be labelled by the letters $K$ or $H$ with the same
number of dashes ($K' = H'$, $K'' = H''$, $K''' = H'''$, etc.), we can put, in
this case, $a = \delta$, that is,

$$a_{H'K''} = \delta_{H'K''}, \tag{145}$$

where $\delta$ is the mixed unit matrix with the diagonal elements
$\delta_{H'K'} = \delta_{H''K''} = \dots = 1$ (all the others being equal to zero).
Equations (144b) reduce, in this case, to

$$K^0_{H'H''} = K''\, \delta_{H'K''},$$

that is, to

$$K^0_{H'H'} = K', \tag{145a}$$

which is the same thing as $K' = H'$, since $K^0_{H'H'} = H^0_{H'H'} = H'$.

We shall now consider the actual case in which $S \neq 0$, assuming that
there still exists in this case a one-to-one correspondence between the
unperturbed states $H'$, $H''$, $H'''$, ... and the perturbed states $K'$, $K''$, $K'''$, ...,
in the sense that the states labelled by the letter $K$ or $H$ with the
same number of dashes coincide with each other when the perturbation
energy $S$ tends to zero.

We shall put accordingly

$$K^0_{K'K'} = K' = H' + \Delta H', \tag{146}$$

where $\Delta H'$ denotes the change of the energy-levels due to the perturbation,
and

$$a_{H'K''} = \delta_{H'K''} + \Delta a_{H'K''}, \tag{146a}$$

the corrections $\Delta a_{H'K''}$ being assumed to be small (compared with 1).
We have further

$$K_H = H_H + S_H, \quad\text{i.e.}\quad K^0_{H'H''} = H^0_{H'H''} + S^0_{H'H''}. \tag{146b}$$


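Substituting these expressions in equations (144b) and taking into
account that $H^0_{H'H''} = H'\, \delta_{H'H''}$, we get

$$H'(\delta_{H'K''} + \Delta a_{H'K''}) + \sum_{H'''} S^0_{H'H'''}\, \delta_{H'''K''} + \sum_{H'''} S^0_{H'H'''}\, \Delta a_{H'''K''} = (H'' + \Delta H'')(\delta_{H'K''} + \Delta a_{H'K''}).$$

Since $\delta_{H'''K''} = 0$ unless $H'''$ corresponds to $K''$ (i.e. $H''' = H''$), when it is equal to 1, and
$(H'' - H')\, \delta_{H'K''} = 0$ both when $K'' = K'$ (because then $H'' = H'$) and
when $K'' \neq K'$, we get

$$S^0_{H'H''} + \sum_{H'''} S^0_{H'H'''}\, \Delta a_{H'''K''} = \Delta H''\, (\delta_{H'K''} + \Delta a_{H'K''}) + (H'' - H')\, \Delta a_{H'K''}. \tag{147}$$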
These equations can be solved by successive approximations, if we
assume that the quantities $S^0_{H'H''}$ (i.e. the matrix elements of the perturbation
energy ‘from the point of view’ of the unperturbed energy)
are small quantities of the same (first) order of magnitude and expand
$\Delta H'$ and $\Delta a$ in series of the form

$$\Delta H' = \Delta_1 H' + \Delta_2 H' + \dots, \qquad \Delta a = \Delta_1 a + \Delta_2 a + \dots, \tag{147a}$$

where $\Delta_n H'$ and $\Delta_n a$ are corrections of the $n$th order (that is, of the
same order of magnitude as the $n$th power of the elements of $S_H$).
Substituting (147a) in (147) and dropping terms of the second and
higher orders of magnitude, we obtain as a first approximation the
equations

$$S^0_{H'H''} = \Delta_1 H''\, \delta_{H'K''} + (H'' - H')\, \Delta_1 a_{H'K''}. \tag{148}$$

Putting $K'' = K'$ (and consequently $H'' = H'$), we get

$$\Delta_1 H' = S^0_{H'H'}. \tag{148a}$$

This formula determines, to the first approximation, the change of the
energy-levels produced by the perturbation.

If $K''$ is different from $K'$ (and consequently $H''$ is different from
$H'$), equation (148) reduces to

$$S^0_{H'H''} = (H'' - H')\, \Delta_1 a_{H'K''},$$

that is,

$$\Delta_1 a_{H'K''} = \frac{S^0_{H'H''}}{H'' - H'}, \tag{148b}$$

giving the first-order expressions for the transformation coefficients
$a_{H'K''}$.

If we preserve in (147) terms of the second order, dropping terms of
the third and higher orders, and take account of the first-order equations
(148), we get the second-order equations:

$$\sum_{H'''} S^0_{H'H'''}\, \Delta_1 a_{H'''K''} = \Delta_2 H''\, \delta_{H'K''} + \Delta_1 H''\, \Delta_1 a_{H'K''} + (H'' - H')\, \Delta_2 a_{H'K''}. \tag{149}$$

It should be remarked that these equations, as well as the equations
of the succeeding orders, can be obtained from (147) by substituting
the expressions (147a) and dropping all terms with the exception of
those of the order in question.

Putting $K'' = K'$ (and $H'' = H'$) in (149), we get

$$\sum_{H''} S^0_{H'H''}\, \Delta_1 a_{H''K'} = \Delta_2 H' + \Delta_1 H'\, \Delta_1 a_{H'K'},$$

or, on account of the relation (148a),

$$\Delta_2 H' = \sum_{H'' \neq H'} S^0_{H'H''}\, \Delta_1 a_{H''K'}.$$

Substituting the expressions (148b) with $H''$ replaced by $H'$ and $H'$ by
$H''$, we get

$$\Delta_2 H' = \sum_{H'' \neq H'} \frac{S^0_{H'H''}\, S^0_{H''H'}}{H' - H''} = \sum_{H'' \neq H'} \frac{|S^0_{H'H''}|^2}{H' - H''}. \tag{149a}$$

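With $K''$ different from $K'$, equation (149) reduces to

$$\sum_{H'''} S^0_{H'H'''}\, \Delta_1 a_{H'''K''} = \Delta_1 H''\, \Delta_1 a_{H'K''} + (H'' - H')\, \Delta_2 a_{H'K''},$$

giving, with the help of (148a) and (148b), the following expression for
the second-order correction in the coefficients $a$:

$$\Delta_2 a_{H'K''} = \sum_{H''' \neq H''} \frac{S^0_{H'H'''}\, S^0_{H'''H''}}{(H'' - H')(H'' - H''')} - \frac{S^0_{H''H''}\, S^0_{H'H''}}{(H'' - H')^2},$$

or

$$\Delta_2 a_{H'K''} = \sum_{H''' \neq H', H''} \frac{S^0_{H'H'''}\, S^0_{H'''H''}}{(H'' - H')(H'' - H''')} + \frac{(S^0_{H'H'} - S^0_{H''H''})\, S^0_{H'H''}}{(H'' - H')^2}. \tag{149b}$$

In carrying out the summation over $H'''$ we must drop the term $H''' = H''$
(as well as $H''' = H'$ in the second form) because the formula

$$\Delta_1 a_{H'''K''} = \frac{S^0_{H'''H''}}{H'' - H'''}$$

holds for the case $H''' \neq H''$ only, while for $H''' = H''$ we have

$$\Delta_1 a_{H''K''} = 0. \tag{150}$$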
This equation can be obtained from the normalization condition which
must be satisfied by the matrix $a$, namely,

$$\sum_{K''} a^{*}_{H'K''}\,a_{H'K''} = 1.$$

Putting $a_{H'K''} = \delta_{H'K''} + \Delta a_{H'K''}$, then since $\delta_{H'K''} = 1$ when $H' = H''$
and 0 when $H' \neq H''$, we get

$$\Delta a_{H'K'} + \Delta a^{*}_{H'K'} + \sum_{K''} \Delta a^{*}_{H'K''}\,\Delta a_{H'K''} = 0, \tag{150 a}$$

whence it follows, to the first order, that

$$\Delta_1 a_{H'K'} + \Delta_1 a^{*}_{H'K'} = 0.$$

Since the diagonal elements of the matrix $a$ must be real ($a^{*}_{H'K'} = a_{H'K'}$),
we have

$$\Delta_1 a_{H'K'} = 0.$$


The formula (149) likewise leaves undetermined the diagonal elements
$\Delta_2 a_{H'K'}$. They can be determined, however, with the help of the
equation (150 a) or rather the equation

$$2\,\Delta_2 a_{H'K'} + \sum_{K''} \Delta_1 a^{*}_{H'K''}\,\Delta_1 a_{H'K''} = 0,$$

which is obtained from it as a second approximation (dropping all
terms save those of the second order) and which, in conjunction with
(148 b), gives

$$\Delta_2 a_{H'K'} = -\frac{1}{2}\sum_{H''\neq H'} \frac{|S^{0}_{H'H''}|^{2}}{(H'-H'')^{2}}. \tag{150 b}$$
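
The first-order coefficient (148 b) and the second-order diagonal correction (150 b) can be compared with an exact rotation in the simplest case. The following sketch assumes a two-level system with a real symmetric perturbation, for which the perturbed axes are obtained from the unperturbed ones by a single plane rotation; the numbers and variable names are illustrative assumptions only.

```python
import math

# Assumed illustrative values.
h1, h2 = 1.0, 3.0                    # unperturbed levels H' and H''
s11, s12, s22 = 0.05, 0.10, -0.02    # real symmetric perturbation matrix S

# Exact rotation angle that diagonalizes K = H + S in the (H', H'') plane:
# tan(2*theta) = 2*K12 / (K11 - K22), taken on the branch of small angles.
theta = 0.5 * math.atan(2.0 * s12 / ((h1 + s11) - (h2 + s22)))
a_diag_exact = math.cos(theta)       # component of the state K' along H'
a_off_exact = math.sin(theta)        # component of the state K' along H''

# Perturbation estimates.
a_off_first = -s12 / (h2 - h1)                       # first order, (148 b)
a_diag_second = 1.0 - 0.5 * (s12 / (h1 - h2)) ** 2   # 1 + second order, (150 b)

print(a_off_exact, a_off_first)
print(a_diag_exact, a_diag_second)
```

The diagonal coefficient departs from unity only in the second order, in line with (150), while the off-diagonal coefficient is of the first order.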

The formula (150) follows in a quite obvious manner from the geometrical
interpretation of the coefficients $a_{H'K''}$ as the cosines of the
angles between the symmetry axes of the quadric surfaces representing
(in the state-space) the energy $H$ and the energy $K$. Since, by definition,
$H$ and $K$ must differ very little from each other, the corresponding
axes $H'$ and $K'$ (or $H''$ and $K''$, etc.) must have approximately the same
direction, while the non-corresponding axes ($H'$ and $K''$) must be nearly
perpendicular to each other. Denoting the angle between $H'$ and $K'$
by $\alpha_{H'K'}$ and considering it as a small quantity of the first order, we get

$$a_{H'K'} = \cos\alpha_{H'K'} \approx 1 - \tfrac{1}{2}(\alpha_{H'K'})^{2},$$

which means that the first-order correction $\Delta_1 a_{H'K'}$ vanishes, while
$\Delta_2 a_{H'K'} = -\tfrac{1}{2}(\alpha_{H'K'})^{2}$. Comparing this with (150 b), we can put

$$(\alpha_{H'K'})^{2} = \sum_{H''\neq H'} \frac{|S^{0}_{H'H''}|^{2}}{(H'-H'')^{2}}. \tag{151}$$


This formula shows that the angles between the corresponding symmetry
axes of $H$ and $K$ are of the same order of magnitude as the
ratios of the matrix elements of the perturbation energy $S$ with respect
to different $H$-states to the difference between the characteristic values
of $H$ for these states.

The same result, in a still simpler form, is obtained from a consideration
of the first-approximation values of the coefficients $a_{H'K''} = \Delta_1 a_{H'K''}$
($K'' \neq K'$). Putting $a_{H'K''} = \cos\alpha_{H'K''}$ and $\alpha_{H'K''} = \tfrac{1}{2}\pi + \Delta\alpha_{H'K''}$, where
$\Delta\alpha_{H'K''}$ denotes a small angle, we get

$$a_{H'K''} = -\sin\Delta\alpha_{H'K''} \approx -\Delta\alpha_{H'K''},$$

whence, according to (148 b),

$$\Delta\alpha_{H'K''} = \frac{S^{0}_{H'H''}}{H'-H''}. \tag{151 a}$$


This angle should not be confused with the angle through which the
axis $H''$ has to be rotated in order to coincide with $K''$ and which is
equal to $\alpha_{H''K''} = \Delta\alpha_{H''K''}$. The comparison of equations (151) and
(151 a) shows that the latter angle can be regarded as the (geometrical)
sum of mutually perpendicular angular displacements of the type
$\Delta\alpha_{H'K''}$ for different values of $H'$ ($\neq H''$). In other words, the angular
displacement $\Delta\alpha_{H'K''}$ can be considered as the component along the
$H'$-axis of the elementary rotation $\alpha_{H''K''}$. We thus obtain the law of
the vector composition of elementary rotations about different (mutually
perpendicular) axes, which is a generalization of the corresponding law
for ordinary three-dimensional space.

In the latter case, an infinitesimal rotation of the coordinate system 
can be specified by a certain vector w, which determines the (apparent) 
change of a fixed vector r by means of the formula Ar = —w xr. So 
far as the first approximation is concerned, the components of w and Ar 
along the old and new axes can be identified with each other. Written in 
components along the old axes, the preceding formula gives the fol- 
lowing equations: 


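$$\begin{aligned}
\Delta x_1 &= \xi_1 - x_1 = -\omega_2 x_3 + \omega_3 x_2,\\
\Delta x_2 &= \xi_2 - x_2 = -\omega_3 x_1 + \omega_1 x_3,\\
\Delta x_3 &= \xi_3 - x_3 = -\omega_1 x_2 + \omega_2 x_1,
\end{aligned}$$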


which can be considered as a particular or rather as a limiting case of
an orthogonal transformation for the case when the two systems ($S$ and
$\Sigma$) differ very little from each other. Putting

$$\omega_1 = \alpha_{32} = -\alpha_{23},\qquad \omega_2 = \alpha_{13} = -\alpha_{31},\qquad \omega_3 = \alpha_{21} = -\alpha_{12},$$

we can rewrite the preceding equations in the form

$$\Delta x_{n'} = -\sum_{n''} \alpha_{n'n''}\,x_{n''}. \tag{152}$$
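
The equivalence between the vector description, $\Delta\mathbf{r} = -\boldsymbol{\omega}\times\mathbf{r}$, and the description by an antisymmetrical tensor $\alpha$ can be verified directly. In the sketch below the assignment of the components of $\omega$ to those of $\alpha$ is the one that makes $-\sum_{n''}\alpha_{n'n''}x_{n''}$ reproduce the three component equations; the helper names and the sample numbers are assumptions made for illustration.

```python
def cross(u, v):
    """Ordinary vector product u x v in three dimensions."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def alpha_from_omega(w):
    """Antisymmetric tensor with alpha_32 = w1, alpha_13 = w2, alpha_21 = w3."""
    w1, w2, w3 = w
    return [[0.0, -w3,  w2],
            [ w3, 0.0, -w1],
            [-w2,  w1, 0.0]]

def delta_x(alpha, x):
    """Apparent change of the components: Delta x_n' = - sum_n'' alpha_n'n'' x_n''."""
    return tuple(-sum(alpha[i][j] * x[j] for j in range(3)) for i in range(3))

omega = (0.01, -0.02, 0.03)   # assumed small rotation vector
x = (1.0, 2.0, 3.0)           # assumed fixed vector

alpha = alpha_from_omega(omega)
dx_tensor = delta_x(alpha, x)
dx_vector = tuple(-c for c in cross(omega, x))

print(dx_tensor, dx_vector)
```

Because $\alpha$ is antisymmetric, the change $\Delta\mathbf{r}$ is perpendicular to $\mathbf{r}$, so that the length of the vector is preserved to the first order, as an orthogonal transformation requires.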


Comparing equations (152) with the exact transformation equations

$$\xi_{n'} = \sum_{n''} a_{n'n''}\,x_{n''},$$

we see that they can be obtained from the latter if we put

$$a_{n'n''} = \delta_{n'n''} - \alpha_{n'n''},\qquad \xi_{n'} = x_{n'} + \Delta x_{n'},$$

where equal indices denote corresponding axes of the new and old system,
i.e. such axes as were initially coincident. The angles $\alpha_{n'n'}$
must approximately vanish for the normalizing and orthogonality
relations to be satisfied.


We thus see that an infinitesimal orthogonal transformation in
ordinary space can be treated as an infinitesimal rotation of the original
coordinate system, specified both with regard to the direction of the
rotation axis and the angle of rotation about it by the (infinitely small)
vector $\omega$ with the components $\omega_1$, $\omega_2$, $\omega_3$, or by the 'antisymmetrical
tensor' $\alpha$ with the components $\alpha_{n'n''} = -\alpha_{n''n'}$, referred to the original
axes.

These results can easily be extended to the infinitesimal orthogonal 
transformations in the ‘state-space’, corresponding to a transition from 
the symmetry axes of the quadric surface representing the unperturbed 
energy H, to the symmetry axes of the quadric representing the per- 
turbed energy K = H+S. 

Leaving the perturbation energy $S$ unspecified, we can represent
the (apparent) change of the components of any vector $\psi$ due to the
small rotation of the coordinate axes by an equation wholly similar to
(152), namely

$$(\Delta\psi)_{H'} = -\sum_{H''} \alpha_{H'H''}\,\psi_{H''}, \tag{152 a}$$

where $\alpha$ denotes an 'anti-Hermitian' tensor (which is a generalization
of the antisymmetric one) satisfying the condition

$$\alpha_{H'H''} = -\alpha^{*}_{H''H'} \tag{152 b}$$

or $\alpha^{\dagger} = -\alpha$.
These results can be obtained in the same way as in the three-dimensional
case from the exact transformation equation,

$$\psi_{K'} = \sum_{H''} a_{H'K''}\,\psi_{H''},$$

by putting $\psi_{K'} = \psi_{H'} + \Delta\psi_{H'}$ and $a_{H'K''} = \delta_{H'H''} - \alpha_{H'H''}$, where the $\alpha$
denote small quantities of the first order. Substituting the latter
expressions in the orthogonality and normalizing conditions,

$$\sum_{K'} a_{H'K'}\,a^{*}_{H''K'} = \delta_{H'H''},$$

and neglecting second-order terms, we get, if the summation index $K'$
is replaced by $H'''$,

$$\sum_{H'''} \delta_{H'H'''}\,\alpha^{*}_{H''H'''} + \sum_{H'''} \delta_{H''H'''}\,\alpha_{H'H'''} = 0,$$

that is,

$$\alpha^{*}_{H''H'} + \alpha_{H'H''} = 0,$$

which is equivalent to (152b).

As a matter of fact, from (148b) and because $\alpha_{H'H''} = -\Delta_1 a_{H'K''}$,
we have

$$\alpha_{H'H''} = \frac{S^{0}_{H'H''}}{H'-H''}, \tag{152c}$$

so that the condition (152b) is actually satisfied.



It should be mentioned that in the case of a generalized space with
more than three dimensions an antisymmetrical tensor is no longer
equivalent to a vector.† It is therefore impossible to represent the
rotation of the quadric surface $H$ into such a position that its axes
coincide (in direction but not in length!) with those of the quadric $K$
by means of a vector corresponding to $\omega$, or to specify the rotation
by its components along the different axes of $H$. Instead of using
the coordinate axes, we can, however, use for the same purpose the
coordinate planes (in the case of ordinary space the number of these
planes is equal to the number of axes, which explains the possibility
of representing the former by the latter). The quantities $\alpha_{H'H''}$ can be
interpreted as the projections of the rotation $H \to K$ on the planes
($H''$, $H'$). The angle through which $H''$ must be rotated to coincide with
$K''$ is given by the equation

$$\alpha^{2}_{H''K''} = \sum_{H'} |\alpha_{H'H''}|^{2},$$

which is similar to the ordinary equation for the composition of elementary
rotations considered as vectors (for instance, $\omega^{2} = \omega_1^{2}+\omega_2^{2}+\omega_3^{2}$)
because in the preceding equation one of the axes ($H''$) remains fixed
and the summation over the different planes passing through it is
equivalent to a summation over all the axes different from $H''$.

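The expressions (152c) for the elementary rotations, as well as
the corresponding (first-order) corrections for the energy values
$\Delta H' = K'-H'$, can be obtained in a somewhat simpler way than before
by starting from the expressions (152a) and using the equations
$H\psi_{H'} = H'\psi_{H'}$ and $K\psi_{K'} = K'\psi_{K'}$.

Putting in the latter equation $\psi_{K'} = \psi_{H'}+\Delta\psi_{H'}$, $K' = H'+\Delta H'$, and
$K = H+S$, we have

$$H\psi_{H'} + S\psi_{H'} + H\Delta\psi_{H'} + S\Delta\psi_{H'} = H'\psi_{H'} + \Delta H'\,\psi_{H'} + H'\Delta\psi_{H'} + \Delta H'\,\Delta\psi_{H'},$$

or, dropping terms of the second order of smallness (i.e. the products
$S\Delta\psi_{H'}$ and $\Delta H'\,\Delta\psi_{H'}$):

$$S\psi_{H'} + H\Delta\psi_{H'} = \Delta H'\,\psi_{H'} + H'\Delta\psi_{H'}. \tag{153}$$

Now by the definition of matrix elements we have

$$S\psi_{H'} = \sum_{H''} S^{0}_{H'H''}\,\psi_{H''}.$$

On the other hand we get, according to (152a),

$$H\Delta\psi_{H'} = -\sum_{H''} \alpha_{H'H''}\,H\psi_{H''} = -\sum_{H''} \alpha_{H'H''}\,H''\psi_{H''}.$$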


† If $n$ is the number of dimensions, then the number of different non-vanishing
components of an antisymmetric tensor is equal to $\tfrac{1}{2}n(n-1)$, which is equal to the
number ($n$) of components of a vector only when $n = 3$.


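Thus (153) can be written in the form

$$\sum_{H''} \left(S^{0}_{H'H''} - H''\alpha_{H'H''}\right)\psi_{H''} = \Delta H'\,\psi_{H'} - H'\sum_{H''} \alpha_{H'H''}\,\psi_{H''}$$

or

$$\sum_{H''} \left[S^{0}_{H'H''} - (H''-H')\,\alpha_{H'H''}\right]\psi_{H''} = \Delta H'\,\psi_{H'} = \sum_{H''} \Delta H'\,\delta_{H'H''}\,\psi_{H''}. \tag{153 a}$$

Equating the coefficients of $\psi_{H''}$ on both sides, we get

$$S^{0}_{H'H'} = \Delta H',\qquad S^{0}_{H'H''} = (H''-H')\,\alpha_{H'H''}\quad (H''\neq H'), \tag{153 b}$$
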


in agreement with the results previously found. 

The fact that equation (153a) splits up into equations (153b) for
the coefficients of the separate $\psi_{H''}$ is due, as already pointed out
(Part I, § 18), to the mutual orthogonality of the functions $\psi_{H''}$ (as
functions of the coordinates $x$, $y$, $z$). If we have an equation of the type
$\sum_{H''} a_{H''}\psi_{H''} = \sum_{H''} b_{H''}\psi_{H''}$ which holds identically (i.e. for all values of
$x$, $y$, $z$), then multiplying it by $\psi^{*}_{H'}$ and integrating over $x$, $y$, $z$, we get
$a_{H'} = b_{H'}$, all the other terms vanishing.

We have assumed, hitherto, that the unperturbed problem was 'non-degenerate',
i.e. that all the characteristic values of $H$ were different.
The essential character of this assumption is clearly seen from the fact
that the equations $\alpha_{H'H''} = S^{0}_{H'H''}/(H'-H'')$ become meaningless (unless $S^{0}_{H'H''}$
vanishes) when $H'' = H'$, while the two states $\psi_{H'}$ and $\psi_{H''}$ remain
different. It is, moreover, impossible to specify the different states, as
has been done so far, by the value of the energy alone. We shall therefore
add to it some other quantity $C$, which commutes with it (i.e.
represents a constant of the motion) and which can be supposed to have
different values for different states which have the same energy.

The alterations in the treatment of a perturbation problem which 
are necessitated by the presence of degeneracy in the unperturbed 
problem can best be understood with the help of the geometrical inter- 
pretation. If the energy H is represented as a quadric surface in the 
state-space, with symmetry axes whose lengths are inversely proportional 
to the corresponding characteristic values of H, then degeneracy means 
that a few of these axes have the same length, the corresponding section 
of the surface, comprising all the equal axes, being 'circle-like'. A
degeneracy of this sort is met with in ordinary analytical geometry in the
case of an ellipsoid with two or three equal axes, the ellipsoid degenerating
into a spheroid or into a sphere.

So long as the surface is not degenerate, the directions of its symmetry 
axes are perfectly definite. Degeneracy involves an arbitrariness in the 
choice of the symmetry axes within the ‘circle-like’ section, any ortho- 
gonal system of axes being appropriate. It may be mentioned that this 
corresponds to the physical indeterminateness of the corresponding 
states and to the necessity of specifying them with the help of some 
other quantity, C say, which can also be imagined to be represented 
by a certain quadric surface. The commutability of H and C means, 
as we know, that the symmetry axes of the corresponding surfaces have 
the same directions; if one of them has a ‘circular’ section its axes 
within this section can be identified with those of the other. 

Let us assume that the surface representing the energy K of the 
perturbed motion is non-degenerate. We shall then find two types of 
relations between its symmetry axes and those of H. So long as the 
latter are intrinsically determined—i.e. apart from the circular sections 
—the axes K’ must differ but very little from the corresponding axes 
H', as has been supposed hitherto. So far, however, as a set of equal 
H-axes is concerned, a set contained within a circular section and fixed 
more or less arbitrarily, the angles between them and the set of K-axes 
corresponding to this section need not be small. The process of successive 
approximations, which was based on the assumption that all the angles 
$\alpha_{H'H''}$ were small, must therefore, in general, lead to wrong results.
That it does lead to wrong results is clear from the formula (152c)
which gives an infinitely large value for $\alpha_{H'H''}$ if the difference $H''-H'$
(for two different states) vanishes, unless $S^{0}_{H'H''}$ also vanishes.

It is thus clear that before starting on the process of successive
approximations based upon the assumption of the smallness of the
angles, one must make them actually small by transforming the sets
of axes which refer to 'circular' sections in such a way that they
approximately coincide with the corresponding set of $K$-axes. This
'preliminary' or zero-order transformation can be carried out for each
circular section independently, i.e. by dropping from the general equation
of the $K$-quadric, or rather from the equations of the $K_H \to K_K$
transformation, all the terms which connect different circular sections
with each other (or with individual axes, if any). In fact the
transformation coefficients $a_{H'K'''}$ and $a_{H''K'''}$, where $H'$ and $H''$ refer to one
circular section and $K'''$ to another 'nearly' circular section, must be
very small of the first order (the two sections being 'nearly' perpendicular
to each other) and can therefore be neglected compared with the
coefficients $a_{H'K''}$ or $a_{H''K''}$, where $K''$ refers to the nearly circular section
of $K$ which approximately coincides with the circular section of $H$
containing the axes $H'$ and $H''$.

It will be convenient to alter our previous notation and to denote
the $r'$ axes of a circular section corresponding to the value $H = H'$ by
$C'_1, C'_2, \dots, C'_{r'}$. The $r'$ axes of the corresponding nearly circular section
of $K$ will be denoted accordingly by $K'_1, K'_2, \dots, K'_{r'}$. There is, in general,
no one-to-one correspondence between these $r'$ $K'$-axes and the $r'$
$C'$-axes. They form two different orthogonal systems and the preliminary
transformation which we are looking for is precisely the
transformation $C' \to K'$ carried out for each circular section separately.

The exact equations of the transformation $H \to K$ are thus split up
into a set of 'zero-order' equations of the following form:

$$\sum_{n=1}^{r'} K^{0}_{C'_m C'_n}\,a_{C'_n K'} = K'\,a_{C'_m K'}, \tag{154}$$

where $m = 1, 2, 3, \dots, r'$.

For each of the 'multiple' values of $H$ corresponding to $r'$ different
states, we thus get a system of $r'$ linear homogeneous equations involving
states of this set only. These equations are quite similar to the general
transformation equations for the case of no degeneracy,

$$\sum_{H''} K^{0}_{H'H''}\,a_{H''K'} = K'\,a_{H'K'},$$

differing from them solely by the fact that they refer to a finite number
of states, a fact which makes it possible to solve them exactly without
the use of the method of successive approximation (whose application
has to be postponed).

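Putting $K = H+S$ and $K' = H'+\Delta H'$ in (154), then since
$H^{0}_{C'_m C'_n} = H'\delta_{mn}$, we get

$$\sum_{n=1}^{r'} S^{0}_{C'_m C'_n}\,a_{C'_n K'} = \Delta H'\,a_{C'_m K'}. \tag{154 a}$$

For the sake of simplicity, we shall rewrite this equation, or rather
the set of $r'$ equations, in the form

$$\sum_{n=1}^{r'} S^{0}_{mn}\,a_n = \Delta H'\,a_m, \tag{154 b}$$

where $m$ is an abbreviation for $C'_m$, and the index $K'$ is dropped. Their
compatibility condition

$$\begin{vmatrix}
S^{0}_{11}-\Delta H' & S^{0}_{12} & \cdots & S^{0}_{1r'}\\
S^{0}_{21} & S^{0}_{22}-\Delta H' & \cdots & S^{0}_{2r'}\\
\vdots & \vdots & & \vdots\\
S^{0}_{r'1} & S^{0}_{r'2} & \cdots & S^{0}_{r'r'}-\Delta H'
\end{vmatrix} = 0 \tag{154 c}$$

gives $r'$ values for the 'additional' energy $\Delta H'$, which are, in general,
different from each other. This is expressed by saying that the perturbation
splits up each multiple energy-level $H'$ into a number ($r'$) of
different sub-levels $K' = H'+\Delta H'_1,\ H'+\Delta H'_2,\ \dots,\ H'+\Delta H'_{r'}$.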
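
For a doubly degenerate level the secular equation (154 c) is a quadratic and can be solved in closed form. The sketch below assumes $r' = 2$ and a real symmetric block of $S$ with a non-vanishing off-diagonal element; the function names and sample values are illustrative assumptions. It returns the two values of $\Delta H'$ together with the normalized zero-order coefficients $a_1$, $a_2$ of (154 b).

```python
import math

def split_levels(s11, s12, s22):
    """Roots of det(S - dH * 1) = 0 for a real symmetric 2x2 block S, cf. (154 c)."""
    mean = 0.5 * (s11 + s22)
    half = math.hypot(0.5 * (s11 - s22), s12)
    return mean - half, mean + half

def zero_order_coefficients(s11, s12, shift):
    """Normalized (a1, a2) satisfying the first row of (154 b) for a given root.

    Assumes s12 != 0, so that the first row alone fixes the ratio a1 : a2.
    """
    a1, a2 = s12, shift - s11     # (s11 - shift) * a1 + s12 * a2 = 0
    norm = math.hypot(a1, a2)
    return a1 / norm, a2 / norm

def expectation(s11, s12, s22, a):
    """Diagonal element of S with respect to the new state, a^T S a."""
    return s11 * a[0] ** 2 + 2.0 * s12 * a[0] * a[1] + s22 * a[1] ** 2

# Assumed illustrative block of the perturbation matrix.
s11, s12, s22 = 0.02, 0.03, -0.02

low, high = split_levels(s11, s12, s22)
a_low = zero_order_coefficients(s11, s12, low)
a_high = zero_order_coefficients(s11, s12, high)

print(low, high)
print(a_low, a_high)
```

The two coefficient sets come out mutually orthogonal, and the diagonal elements of $S$ taken with respect to the new states reproduce the two roots of the secular equation.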

To each value of $\Delta H'$, $\Delta H'_i$ say, there corresponds a set of values
of the $r'$ coefficients $a_n$:

$$a_{1i},\ a_{2i},\ \dots,\ a_{r'i}.$$

As in the general case, each of these sets must be normalized to 1, the
different sets being orthogonal to each other. We thus get for each
$r'$-fold value of the unperturbed energy $H'$ a unitary transformation
matrix $a$ of order $r'$, which serves to transform the original $r'$ functions
$\psi_{C'_1}, \psi_{C'_2}, \dots, \psi_{C'_{r'}}$ associated with the energy-level $H'$ into new functions
$\psi'_{K'_1}, \psi'_{K'_2}, \dots, \psi'_{K'_{r'}}$ associated with the different sub-levels into which
this level is split up. Using the one-row matrix notation for the
two sets of functions, we can write the relation between them in the
form $\psi' = a\psi$ or $\psi'^{\dagger} = \psi^{\dagger}a^{\dagger}$.

The preceding results are identical with those obtained in Chap. II, § 9,
by means of the variational method.

It should be understood that the functions $\psi'$ do not represent a set
of K-states, but another degenerate set of H-states which only approxi- 
mate to the corresponding K-states. Starting with these functions, it 
is possible, in the usual way, to obtain higher approximations. It is 
important to note that the first approximation values for the energy 
are determined, according to (154c), in conjunction with the ‘zero 
approximation’ for the characteristic functions. 

It can easily be shown that the $H$-states specified by the new functions
$\psi'$ are such that the matrix of the perturbation energy $S$ with
respect to them is diagonal. This follows from the fact that equations
(154 a) are of the same form as the equations for the transformation of
the matrix $K_H$ to the diagonal form $K_K$, $K$ being replaced by $S$, $K'$ by
$\Delta H'$, and the whole quadric $K$ by its 'nearly circular' section. Denoting
the transformed matrix of the perturbation energy (for the $r'$ states $\psi'$)
by $S'$, we have $S' = a^{-1}Sa = a^{\dagger}Sa$.

The diagonal elements of $S'$ are equal to the values of $\Delta H'$ for the
corresponding states, so that we can put

$$S'_{K'_i K'_i} = \Delta H'_i,$$

which is exactly of the same form as equation (148 a), referring to the
case in which there is no degeneracy.


These equations have a very simple physical meaning, which can
be expressed by saying that the additional energy due to perturbing
forces is equal, in the first approximation, to the average value of the
perturbation energy $S$ for the unperturbed motion.† When there is no
degeneracy, the latter is specified unambiguously by a function $\psi_{H'}$
referring to one definite state. In the presence of degeneracy these
unperturbed states have to be defined by means of the preliminary
transformation, and are, in general, different from the original states.

We are now in a position to formulate the conditions under which
a perturbation can be treated as weak. This weakness must obviously
correspond to the smallness of the angles between the symmetry axes
of the surfaces K and H and also to a smallness of the difference
between the lengths of these axes. The 'circular' sections of H corresponding
to degeneracy need not be taken into account, since the
directions of the axes lying within them remain arbitrary and can
always be adjusted to be close to those of the corresponding section
of K.

Now we have seen that, to a first approximation, the angles $\alpha_{H'H''}$ are
equal to $S^{0}_{H'H''}/(H'-H'')$ and the differences $K_{K'K'} - H_{H'H'} = K'-H'$
are equal to $S^{0}_{H'H'}$. It follows from this that the perturbation can be
considered as weak if the matrix elements of the perturbation energy
$S$ with respect to different values of $H$ are small compared with the
difference between these values, and the diagonal elements are small
compared with the corresponding values of $H$.

The smallness of S in this sense does not exclude the possibility that 
S, considered as a function of the coordinates of the particle (i.e. in 
the classical sense), should become very large and even infinite at certain 
points or regions. This makes the range of applicability of the wave- 
mechanical perturbation theory infinitely broader than that of the 
classical mechanics, which is restricted by the condition that S should 
be small compared with H’ at all points of the unperturbed path. 


† It should be mentioned that the same result holds in the perturbation theory of
classical mechanics, the average value of S being defined here as the average value with
respect to the time.


20. Extension of the Preceding Theory to the Case of ‘Relative
Degeneracy’ and Continuous Spectra; Effect of Perturbation
on Various Physical Quantities.

In many non-degenerate problems we meet with the case of a perturbation
which cannot be described as weak (in the above sense) with
regard to pairs of (unperturbed) states belonging to certain sets, while
it remains weak with regard to pairs of states belonging to different 
sets. This means that the matrix elements of S with respect to the 
different states of the same set are large—or at least not small—com- 
pared with the energy differences between these states, while the matrix 
elements of S with respect to states belonging to any two different sets 
are small compared with the corresponding energy differences. In the 
limiting case when the energy differences between the states of the same 
set vanish, we get back to the ‘degenerate’ problem considered before. 
It is plain, however, that the same method can be applied approxi- 
mately when these energy-differences do not exactly vanish but are 
small compared with the corresponding matrix elements of S, so that 
without sensible error the (unperturbed) energies of the states in ques- 
tion can be identified with each other. 

This serves to show that the notion of ‘degeneracy’ can be visualized 
as a relative one, from the point of view of the perturbation energy 
S which we are interested in, the ‘absolute’ degeneracy which has been 
considered hitherto forming but the limiting case of this relative 
degeneracy. If, for instance, S contains a continuously variable para- 
meter (an electric or magnetic field, say), we can pass, by steadily 
increasing it, from a practically non-degenerate problem to a practically 
degenerate one, the degeneracy extending over certain sets of states 
whose energy-differences become small, as S increases, with respect to 
the corresponding matrix elements of S, while the matrix elements of 
the same function remain small compared with the energy-differences 
between states of different sets.†

We shall assume that such a subdivision of the various unperturbed 
states into relatively narrow sets, which lie wide apart from each other 
on the energy scale, is possible, and shall denote these states as multi- 
plets. When the perturbation energy (defined by the value of its matrix 
elements with respect to the corresponding states) is small compared 
with the distance between the different multiplets and not small 
(without necessarily being large) compared with the ‘widths’ of the 
separate multiplets, the perturbation theory given in the preceding 
section is no longer applicable, and must be replaced by a more general 
method. 

† A typical example of this condition is found in the transition from a weak to a strong
magnetic field in the theory of the Zeeman effect (or Paschen-Back effect).

This generalized perturbation method (which has been pointed out
by Lennard-Jones and by Jones) is extremely simple and consists in
splitting up the exact system of the transformation equations

$$\sum_{H''} K^{0}_{H'H''}\,a_{H''K'''} = K'''\,a_{H'K'''}$$

into a number of approximate systems, referring to the separate multiplets
and obtained from the above equations by confining the summation
over $H''$ for each value of $H'$ to such states only as belong to the same
multiplet as $H'$.

This is exactly what we have done before in writing down the equations
(154) which refer to the limiting case of absolute degeneracy.
They are applicable, however, just as well to the more general case of
a relative degeneracy if the letters $C'_1, C'_2, \dots, C'_{r'}$ are used to denote the
states of the same 'multiplet', with energy-values $H'_1, H'_2, \dots, H'_{r'}$ lying
close to a certain value $H'$ and far away from the energy values specifying
all the other unperturbed states. To prove this we need but
note the fact that the matrix elements of the total energy $K$ with
respect to states of different sets are relatively small and can therefore
be neglected compared with those which refer to the same set
(multiplet).

In the geometrical representation of the unperturbed and the per- 
turbed states as the axes of the quadric surfaces H and K in the state- 
space, a multiplet corresponds to a ‘nearly’ circular section of the 
former. So long as each such section is nearly parallel to a certain also 
nearly circular section of the K-surface, we have to deal with a per- 
turbation which can be considered as weak with regard to the different 
multiplets. It can be, however, at the same time strong with regard to 
the states of the same multiplet, if the symmetry axes of the corre- 
sponding nearly circular sections of H and K have entirely different 
directions. A one-to-one correspondence between the unperturbed 
states of each multiplet and the perturbed ones cannot be traced in this 
case, just as in the case of an absolute degeneracy. The difference 
between the two cases lies only in the fact that in the former case the 
unperturbed states are fixed unambiguously, while in the latter they 
are represented by a perfectly arbitrary set of mutually perpendicular 
axes in the corresponding exactly circular section of the quadric H. 

As has just been mentioned, the equations (154) still hold for the
case of the 'relative degeneracy' if the letters $C'_1, \dots, C'_{r'}$ serve to distinguish
the states of a multiplet belonging to neighbouring values of the energy
$H'_1, \dots, H'_{r'}$. The equations (154a) or (154b) are, however, not applicable
to the general case, for we must take into account the differences
between the various 'sub-levels' $H'_n$ ($n = 1, \dots, r'$). To do this we need
only replace $\Delta H'$ in (154a) by $\Delta H_m = K'-H'_m$, which gives, in the
notation of (154b),

$$\sum_{n=1}^{r'} S^{0}_{mn}\,a_n = \Delta H_m\,a_m \tag{155}$$

or

$$\sum_{n=1}^{r'} S^{0}_{mn}\,a_n = (\Delta H' - \Delta H'_m)\,a_m, \tag{155 a}$$

where $\Delta H' = K'-H'$ and $\Delta H'_m = H'_m-H'$,


$H'$ denoting some average of the $r'$ values $H'_1, H'_2, \dots, H'_{r'}$. The compatibility
condition of the equations (155 a)

$$\begin{vmatrix}
S^{0}_{11}+\Delta H'_1-\Delta H' & S^{0}_{12} & \cdots & S^{0}_{1r'}\\
S^{0}_{21} & S^{0}_{22}+\Delta H'_2-\Delta H' & \cdots & S^{0}_{2r'}\\
\vdots & \vdots & & \vdots\\
S^{0}_{r'1} & S^{0}_{r'2} & \cdots & S^{0}_{r'r'}+\Delta H'_{r'}-\Delta H'
\end{vmatrix} = 0 \tag{155 b}$$

differs from (154c) by the additional terms $\Delta H'_m$ in the diagonal elements
of the determinant, and leads as before to $r'$ (in general different)
values of the perturbed energy $K' = H'+\Delta H'$. If the non-diagonal
terms of the determinant are sufficiently small it reduces to the product
of the diagonal terms, leading to the expressions $\Delta H' = S^{0}_{mm}+\Delta H'_m$ or
$\Delta H_m = S^{0}_{mm}$, which have been obtained in the preceding section for the
case of no degeneracy. If, on the contrary, the terms $\Delta H'_m$ or rather
$H'_m-H'_n$ are small compared with $S^{0}_{mn}$, equation (155b) practically
reduces to the equation (154c) for the case of complete (absolute)
degeneracy.
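
The two limiting cases of (155 b) can be seen explicitly for a doublet. The sketch below assumes a multiplet of two states placed symmetrically about the mean energy $H'$ (so that $\Delta H'_1 = -d$ and $\Delta H'_2 = +d$) and a real perturbation with vanishing diagonal elements and off-diagonal element $\lambda$; the function name and the numerical values are assumptions chosen for illustration.

```python
import math

def multiplet_levels(d1, d2, s11, s12, s22):
    """Roots dH' of the 2x2 form of (155 b); d_m stands for H'_m - H'."""
    k11, k22 = s11 + d1, s22 + d2
    mean = 0.5 * (k11 + k22)
    half = math.hypot(0.5 * (k11 - k22), s12)
    return mean - half, mean + half

d = 0.01  # assumed half-width of the doublet

# Perturbation weak compared with the width: the levels barely move,
# dH' ~ S_mm + dH'_m, as in the non-degenerate theory.
weak = multiplet_levels(-d, d, 0.0, 1e-4, 0.0)

# Perturbation strong compared with the width: the roots approach those of
# (154 c), i.e. +/- lambda, as for absolute degeneracy.
strong = multiplet_levels(-d, d, 0.0, 1.0, 0.0)

print(weak, strong)
```

Increasing $\lambda$ continuously carries the doublet from the first regime to the second, which is the sense in which degeneracy is a relative notion.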

We have hitherto assumed that the wave functions $\psi_{H'}$ specifying
the unperturbed states are orthogonal with respect to each other. The 
above theory can easily be extended to the case when the orthogonality 
condition is not fulfilled. We need not, however, consider this case in 
detail here, for it has been dealt with already in § 9 of Chap. II by the 
variational method. The results embodied in the equations (61) are 
a generalization of the equations (154), which differ from the (special- 
ized) equations (62) in the notation only. 

It should be mentioned that to the states defined by non-orthogonal 
wave functions there correspond in the state-space a system of non- 
orthogonal axes to which the energy quadrics H and K are referred. 
The non-orthogonality of these axes means physically that the corresponding
states are not mutually excluded, the integral $\int \psi^{*}_{H'}\psi_{H''}\,dV$
measuring in fact the probability of one of them when the other is
supposed to be realized.

So far we have dealt only with the case in which the unperturbed
motion has a discrete energy spectrum (which corresponds, classically, to
its being confined to a limited region of space). The case of a continuous
$H$-spectrum could be treated on similar lines. It is, however, meaningless
to determine the change $\Delta H$ of the energy-levels produced by the
perturbation, when these levels form a continuous series. Thus one of
the main problems of the perturbation theory relating to the case of
discrete $H$-spectra, together with the complications arising in connexion
with degeneracy, drops out. The other problem, that of the determination
of the change $\Delta\psi$ of the wave functions specifying the
stationary states, can be solved in the same way as before, i.e. by
determining the transformation coefficients $a_{H'K''}$. In the present case
the zero approximation is given by the formula

$$a^{0}_{H'K''} = \delta(H'-H''),$$

instead of $a^{0}_{H'K''} = \delta_{H'H''}$. Instead of equation (144b), we have

$$\int K^{0}_{H'H''}\,a_{H''K'''}\,dH'' = K'''\,a_{H'K'''}.$$


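Putting $a_{H'K'''} = \delta(H'-H''')+\Delta a_{H'K'''}$ and $K = H+S$, then since

$$H^{0}_{H'H''} = H'\,\delta(H'-H''),$$

we get

$$H'\left[\delta(H'-H''')+\Delta a_{H'K'''}\right] + S^{0}_{H'H'''} + \int S^{0}_{H'H''}\,\Delta a_{H''K'''}\,dH'' = K'''\left[\delta(H'-H''')+\Delta a_{H'K'''}\right],$$

which, with $K''' = H'''+\Delta H'''$, can be written in the form

$$S^{0}_{H'H'''} + \int S^{0}_{H'H''}\,\Delta a_{H''K'''}\,dH'' = \Delta H'''\left[\delta(H'-H''')+\Delta a_{H'K'''}\right] + (H'''-H')\,\Delta a_{H'K'''}.$$

This method can be conveniently applied only when the quantities
$\Delta a_{H'K''}$ are known to be small, a condition which is, in general, not
satisfied.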
An alternative method consists in the direct determination of the
change of the functions $\psi_{H'}$, $\Delta\psi_{H'}$, which is produced by the perturbation,
without the use of the integral representation

$$\Delta\psi_{H'} = \int \Delta a_{H''H'}\,\psi_{H''}\,dH''$$

(where $\Delta a_{H''H'} = \Delta a_{H''K'}$). This can be done with the help of the
equation

$$(H+S-K')(\psi_{H'}+\Delta\psi_{H'}) = 0,$$

which can be written in the form

$$(H-K')\,\Delta\psi_{H'} = -S(\psi_{H'}+\Delta\psi_{H'}) \tag{156}$$

and which differs from the approximate equation (153) by leaving
'unsplit' the energy $K'$ of the perturbed motion and by preserving the
small term $S\Delta\psi_{H'}$. Dropping it, we get the equation of the first
approximation:

$$(H-K')\,\Delta_1\psi_{H'} = -S\psi_{H'}. \tag{156 a}$$

Substituting on the right side the $n$th-order correction $\Delta_n\psi_{H'}$ for $\psi_{H'}$,
we get the equation for the correction of the $(n+1)$th order,

$$(H-K')\,\Delta_{n+1}\psi_{H'} = -S\,\Delta_n\psi_{H'}, \tag{156 b}$$

the exact function $\psi_{H'}+\Delta\psi_{H'}$ being thus defined as the limit of the
series $\psi_{H'}+\Delta_1\psi_{H'}+\Delta_2\psi_{H'}+\dots$


This method has been worked out by Born in connexion with collision
problems (see Part III). It can be applied also to the case of
discrete spectra (thus enabling one to avoid the determination of the
transformation coefficients $a$); but in this case it must be modified by
putting $K' = H'+\Delta H' = H'+\Delta_1 H'+\Delta_2 H'+\dots$, which leads to the
equations

$$\begin{aligned}
(H-H')\,\Delta_1\psi_{H'} &= -(S-\Delta_1 H')\,\psi_{H'},\\
(H-H')\,\Delta_2\psi_{H'} &= -(S-\Delta_1 H')\,\Delta_1\psi_{H'} + (\Delta_2 H')\,\psi_{H'},\\
&\;\;\vdots\\
(H-H')\,\Delta_{n+1}\psi_{H'} &= -(S-\Delta_1 H')\,\Delta_n\psi_{H'} + (\Delta_2 H')\,\Delta_{n-1}\psi_{H'} + \dots + (\Delta_{n+1} H')\,\psi_{H'}.
\end{aligned} \tag{157}$$

The problem becomes more complicated, for we must determine not
only the functions $\Delta_1\psi_{H'}$, $\Delta_2\psi_{H'}$, etc., but at the same time the numbers
$\Delta_1 H'$, $\Delta_2 H'$, .... This can be done with the help of the so-called
orthogonality property of the non-homogeneous linear equations of the form

$$(H-H')\,\chi = f. \tag{157 a}$$
This ‘orthogonality’ consists in the following: Multiplying the preceding
equation by the solution of the corresponding homogeneous equation
$(H-H')\psi_{H'} = 0$, or its conjugate complex $\psi^*_{H'}$, and integrating, we get,
in view of the self-adjointness of the operator $H$,
$$\int\psi^*_{H'}(H-H')\chi\,dV = \int\chi\,(H-H')^*\psi^*_{H'}\,dV = 0,$$
and consequently
$$\int f\,\psi^*_{H'}\,dV = 0. \tag{157 b}$$
Applying this ‘orthogonality property’ to the first of equations (157), we get
$$\Delta_1H'\int\psi^*_{H'}\psi_{H'}\,dV = \int\psi^*_{H'}S\psi_{H'}\,dV,$$
that is, $\Delta_1H' = S^0_{H'H'}$. Applying it to the second, we get in a similar way
$$\Delta_2H' = \int\psi^*_{H'}(S-\Delta_1H')\,\Delta_1\psi_{H'}\,dV,$$
which can easily be evaluated after $\Delta_1\psi_{H'}$ has been determined from the
first of equations (157). This process can be prolonged as far as one
may desire, the determination of $\Delta_nH'$ always preceding by one step
that of $\Delta_n\psi_{H'}$.
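
The recursion (157), together with the orthogonality condition (157 b), can be tried out numerically on a small matrix model. What follows is only a sketch under stated assumptions: $H$ is taken as a diagonal matrix with non-degenerate levels `E`, $S$ as a small Hermitian matrix, and every correction $\Delta_n\psi_{H'}$ is chosen orthogonal to $\psi_{H'}$ (a normalization convention which the text does not fix). The function name `corrections` and the sample numbers are illustrative only.

```python
import numpy as np

def corrections(E, S, k, order):
    """Return [H', D1 H', D2 H', ...] and [psi, D1 psi, D2 psi, ...] for level k."""
    E = np.asarray(E, dtype=float)
    S = np.asarray(S, dtype=complex)
    psi = np.zeros(len(E), dtype=complex)
    psi[k] = 1.0                        # unperturbed state psi_{H'}
    others = np.arange(len(E)) != k     # all states H'' different from H'
    d_energy = [E[k]]
    d_psi = [psi]
    for n in range(1, order + 1):
        # right-hand side of the n-th equation (157), without the term (D_n H') psi
        rhs = -S @ d_psi[n - 1]
        for j in range(1, n):
            rhs = rhs + d_energy[j] * d_psi[n - j]
        # the orthogonality condition (157 b) fixes D_n H'
        d_n = -np.vdot(psi, rhs).real
        rhs = rhs + d_n * psi
        # solve (H - H') D_n psi = rhs on the states H'' != H'
        # (assumed convention: D_n psi has no component along psi)
        new = np.zeros(len(E), dtype=complex)
        new[others] = rhs[others] / (E[others] - E[k])
        d_energy.append(d_n)
        d_psi.append(new)
    return d_energy, d_psi

# illustrative (assumed) numbers: three levels and a weak symmetric perturbation
E = np.array([0.0, 1.0, 2.5])
S = 0.02 * np.array([[1.0, 2.0, 0.5],
                     [2.0, -1.0, 1.0],
                     [0.5, 1.0, 0.3]])
d_energy, d_psi = corrections(E, S, k=0, order=12)
exact = np.linalg.eigvalsh(np.diag(E) + S)[0]
print("summed series minus exact level:", sum(d_energy) - exact)
```

In this sketch `d_energy[1]` reproduces $\Delta_1H' = S^0_{H'H'}$, and each energy correction is obtained one step before the corresponding correction of the function, as described above.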

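If (157 a) is multiplied by $\psi^*_{H''}$ instead of $\psi^*_{H'}$, we obtain, on integration,
$$(H''-H')\int\psi^*_{H''}\chi\,dV = \int\psi^*_{H''}f\,dV. \tag{157 c}$$
This gives, if applied to the first of equations (157),
$$\int\psi^*_{H''}\,\Delta_1\psi_{H'}\,dV = -\frac{S^0_{H''H'}}{H''-H'},$$
i.e. the expression for the coefficient $\Delta_1a_{H''H'}$. This is quite natural, for
if we put $\Delta_1\psi_{H'} = \sum_{H''}\Delta_1a_{H''H'}\psi_{H''}$ then, in view of the orthogonality
of the functions $\psi_{H'}$ and $\psi_{H''}$, we get $\int\psi^*_{H''}\,\Delta_1\psi_{H'}\,dV = \Delta_1a_{H''H'}$.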

The preceding results obviously hold only when the unperturbed
problem is not degenerate, and must be modified if there is degeneracy,
either absolute or relative.

We shall, however, leave that case aside and shall briefly examine 
the approximate effect produced by the perturbation on any physical 
quantity F described as a matrix, from the point of view of H in the 
case of the unperturbed motion and that of K in that of the perturbed 
one. This can be readily done after we have succeeded in determining 
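the supposedly small quantities $\Delta a_{H'H''}$ or $\Delta\psi_{H'}$. Putting
$$F^0_{K'K''} - F^0_{H'H''} = \Delta F^0_{H'H''},$$
we have
$$\Delta F^0_{H'H''} = (a^\dagger Fa - F)^0_{H'H''},$$
or, since $a = \delta+\Delta a$ and $\delta^\dagger F = F\delta = F$,
$$\Delta F^0_{H'H''} = (F\,\Delta a + \Delta a^\dagger F)^0_{H'H''} + (\Delta a^\dagger F\,\Delta a)^0_{H'H''}.$$
This gives, to the first order of approximation,
$$\Delta_1F^0_{H'H''} = (F\,\Delta_1a + \Delta_1a^\dagger F)^0_{H'H''},$$
or in the case of a discrete $H$-spectrum (with no degeneracy or a
degeneracy accounted for by a preliminary transformation), according to
(148 b):
$$\Delta_1F^0_{H'H''} = \sum_{H'''\neq H''}\frac{F^0_{H'H'''}S^0_{H'''H''}}{H''-H'''} + \sum_{H'''\neq H'}\frac{S^0_{H'H'''}F^0_{H'''H''}}{H'-H'''}, \tag{158}$$
since $\Delta_1a^\dagger_{H'H'''} = \Delta_1a^*_{H'''H'} = -\Delta_1a_{H'H'''} = \dfrac{S^0_{H'H'''}}{H'-H'''}$. Putting $H'' = H'$
and writing $H''$ for $H'''$ we obtain, in particular,
$$\Delta_1F^0_{H'H'} = \sum_{H''\neq H'}\frac{F^0_{H'H''}S^0_{H''H'} + S^0_{H'H''}F^0_{H''H'}}{H'-H''}. \tag{158 a}$$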


This formula determines the change of the average or probable values 
of F for the different unperturbed states as compared with the corre- 
sponding perturbed states. Putting F = S, we get 


$$\Delta_1S^0_{H'H'} = 2\sum_{H''\neq H'}\frac{|S^0_{H'H''}|^2}{H'-H''}.$$
Comparing this with (149 a), we obtain the following relation between the
second-order correction for the energy and the first-order correction
for $S^0_{H'H'}$:
$$\Delta_2H' = \tfrac12\,\Delta_1S^0_{H'H'}. \tag{158 b}$$
This formula is quite similar to
$$\Delta_1H' = S^0_{H'H'},$$
and can be further generalized with the result
$$\Delta_nH' = \frac{1}{n}\,\Delta_{n-1}S^0_{H'H'},$$
if higher-order corrections for the matrix elements are taken into
consideration, according to (157a). We shall not, however, consider in
detail this question, which can easily be solved by substituting in (157 a)
the expressions $\Delta a = \Delta_1a+\Delta_2a+\dots$.
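
Relation (158 b) can be checked on a small matrix model, and the general relation can be made plausible in the same way. If $S$ is replaced by $\lambda S$, the derivative of the exact level $K'$ with respect to $\lambda$ equals the mean value of $S$ in the exact perturbed state; comparing equal powers of $\lambda$ on both sides then gives $n\,\Delta_nH' = \Delta_{n-1}S^0_{H'H'}$. The sketch below assumes a diagonal $H$ with non-degenerate levels and a small real symmetric $S$; all names and numbers are illustrative.

```python
import numpy as np

E = np.array([0.0, 1.0, 2.5])             # assumed non-degenerate levels H'
S = 0.02 * np.array([[1.0, 2.0, 0.5],
                     [2.0, -1.0, 1.0],
                     [0.5, 1.0, 0.3]])    # assumed weak real symmetric perturbation
k = 0                                     # the level H' under consideration
others = [m for m in range(len(E)) if m != k]

def first_order_diagonal(F):
    """Formula (158 a) for the diagonal element of F in the state k."""
    return sum((F[k, m] * S[m, k] + S[k, m] * F[m, k]) / (E[k] - E[m])
               for m in others)

d1_S = first_order_diagonal(S)                              # first-order correction of S
d2_H = sum(S[k, m] ** 2 / (E[k] - E[m]) for m in others)    # second-order energy

def level(lam):
    """Exact level of H + lam*S that grows out of H' (level order assumed preserved)."""
    return np.linalg.eigvalsh(np.diag(E) + lam * S)[k]

def mean_S(lam):
    """Mean value of S in the exact perturbed state belonging to that level."""
    _, vectors = np.linalg.eigh(np.diag(E) + lam * S)
    chi = vectors[:, k]
    return chi @ S @ chi

step = 1e-5
slope = (level(1.0 + step) - level(1.0 - step)) / (2 * step)
print("(158 b):", d2_H, 0.5 * d1_S)
print("dK'/dlambda and mean value of S:", slope, mean_S(1.0))
```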

For the sake of illustration we shall apply the preceding equations 
to the case of a hydrogen-like atom, perturbed by a homogeneous 
electric field $E$ parallel to the $z$-axis. We have in this case $S = -eEz$,
where $z$ is the coordinate of the electron with respect to the nucleus.
Putting in (158a) $F = ez$, we obtain the expression for the additional
electric moment induced by the field when the atom is supposed to
remain in the (non-degenerate) unperturbed state $H'$:
$$e\,\Delta_1z^0_{H'H'} = 2e^2E\sum_{H''\neq H'}\frac{|z^0_{H'H''}|^2}{H''-H'} = \alpha E, \tag{158 c}$$
where $\alpha$ is the polarization (or susceptibility) coefficient. The
corresponding energy must obviously be equal to $-\tfrac12\alpha E^2 = \tfrac12\Delta_1S^0_{H'H'}$, which
is in agreement with the relation (158b) since the energy in question
corresponds to the second-order correction ($\Delta_2H'$).

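The same results are obtained, of course, if instead of the transformation
coefficients the transformed functions $\phi$, or rather the corrections
$\Delta\psi$, are used. Limiting ourselves to the first approximation, we get
$$\begin{aligned}
F^0_{K'K''} &= \int(\psi^*_{H'}+\Delta\psi^*_{H'})\,F\,(\psi_{H''}+\Delta\psi_{H''})\,dV\\
&= \int\psi^*_{H'}F\psi_{H''}\,dV + \int\Delta\psi^*_{H'}F\psi_{H''}\,dV + \int\psi^*_{H'}F\,\Delta\psi_{H''}\,dV,
\end{aligned}$$
that is,
$$\Delta_1F^0_{H'H''} = \int\Delta_1\psi^*_{H'}F\psi_{H''}\,dV + \int\psi^*_{H'}F\,\Delta_1\psi_{H''}\,dV. \tag{159}$$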


These expressions can be used in the case of continuous $H$-spectra when
the functions $\Delta_1\psi_{H'}$ are determined directly by Born’s method. If they
are determined with the help of the transformation coefficients, we get,
as before, $\Delta_1F^0_{H'H''} = (F\,\Delta_1a+\Delta_1a^\dagger F)^0_{H'H''}$, which means in the present
case
$$\Delta_1F^0_{H'H''} = \int\left(\frac{F^0_{H'H'''}S^0_{H'''H''}}{H''-H'''} + \frac{S^0_{H'H'''}F^0_{H'''H''}}{H'-H'''}\right)dH''' \tag{159 a}$$
instead of (158).

In conclusion the following remark should be made. It can happen 
that, while the unperturbed motion is confined to a finite region and 
has accordingly, within a certain interval of energy values, a discrete 
spectrum, the perturbed motion has, within the same interval, a con- 
tinuous energy spectrum, which means that the perturbing forces, even 
when small, can extract the particle and drive it to infinity. An example 
of this condition is furnished by the action of a homogeneous electric 
field on a hydrogen atom. In the region of low energy values the con- 
tinuous energy spectrum, corresponding to the presence of the electric 
field, practically reduces to a discrete one, with each H-level split up
(as a consequence of degeneracy) into several sub-levels. This
phenomenon is known as the Stark effect. The sub-levels in question have,
however, a certain effective width which increases with the strength of 
the electric field and which corresponds to the phenomenon of pre- 
dissociation, discussed in Part I, § 16. This means that there exists a 
certain probability for the atom to be ionized by the electric field even 
if the unperturbed state of the atom corresponds to the lowest energy. 
The width of the energy-levels becomes, however, marked for unper- 
turbed states, which correspond to comparatively high energy-levels, 
where the energy spectrum of the perturbed atom becomes practically 
continuous. In the case of the unperturbed atom, the continuous 
spectrum starts at the point where the energy is equal to zero, while 
for a perturbed atom it starts below this point—and indeed the more 
below, the larger the perturbing electric field. 


21. Perturbation Theory involving the Time; General Processes; 
Theory of Transitions 
In all the foregoing developments the time has been completely 
ignored. This has been possible because we have limited ourselves to 
the consideration of such physical quantities as do not depend upon 
the time. It may seem, at first sight, that the introduction of the time 
as an independent variable into the expression of an operator, $F(t)$ say,
representing some variable physical quantity, would only have the
effect of making its characteristic values, and consequently the states
specified by them, functions of the time. That this is not so is clear,
however, from the example of the energy. If the energy operator $K$
contains the time explicitly, then an equation of the type $(K-K')\phi_{K'} = 0$
has no physical meaning and must be replaced by the general equation
of motion
$$(K+p_t)\phi = 0, \tag{160}$$
where $p_t = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$. The equation $(K-K')\phi_{K'} = 0$ would correspond to


the treatment of the time as a simple parameter; from the purely mathe- 
matical point of view, the appearance of the time would have no 
particular meaning, save that of making the characteristic values K’ 
and the characteristic functions $\phi_{K'}(t)$ definite functions of the time.
These functions, as well as the corresponding characteristic values $K'(t)$,
would, however, have nothing to do with those functions $\phi(x,t)$ which
describe wave-mechanically the motion determined by the energy
operator $K$ and which are the solutions of equation (160).

So long as $K$ depends upon the time, this equation does not admit
particular solutions of the type $\phi = \phi^0_{K'}(x)\,e^{-i2\pi K't/h}$, which means, from
the physical point of view, that K has no characteristic values, or, in 
other words, that the values of a variable energy cannot be specified. 

This result constitutes one of the fundamental differences between 
wave mechanics and classical mechanics, where the value of a variable 
energy can always be ascertained as a definite function of the time. 
The same refers to other operators involving the time as an independent 
variable. 

It is true that the energy is more intimately connected with the time 
than any other operator. It seems, however, doubtful whether an 
equation of the form $F\psi = F'\psi$ defining the characteristic values of
an operator $F(t)$ has any meaning if $F(t)$ depends upon the time, so
long at least as the latter is treated on an entirely different basis from 
that of the coordinates x,y,z. The exceptional role of the time is revealed 
by the fact that, in contradistinction to the coordinates, it cannot be 
used for the specification of the states, the latter being referred, in 
general, to a particular instant of time. The time, therefore, cannot 
be treated on the same lines as the coordinates and other physical 
quantities, and, in particular, it cannot be represented as an operator 
or a matrix with regard to some other basic quantity. Even when 
completely ‘inactive’, the time remains above the realm of ordinary 
quantities, ruling out the very possibility of their determination (so far
as exact and not probable values are concerned) by its active
interference.

Nevertheless, the transformation theory which has been developed 
in the preceding chapter can be applied in a somewhat modified and 
generalized form to variable quantities and, in particular, to the energy 
K of a particle moving in a variable field of force. 

If the variable part of K refers to a comparatively small force, we 
can regard the latter as a perturbing factor causing transitions between 
the states specified by the part of K which does not contain the time. 
This theory of transitions has been outlined already in Part I, § 14. 
We shall now briefly recapitulate it, using the new notation, and we 
shall point out its connexion with the transformation theory. 

The variable part of K, which will be regarded as the perturbation 
energy, will be denoted, as before, by S, and the constant part by H. 
The function $\phi(x,t)$, which is the general solution of equation (160),
can be represented as a superposition of the (normalized) functions $\psi_{H'}$
which correspond to the different states specified by the operator H, 
with suitably determined variable coefficients. 

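Taking first the case of a discrete $H$-spectrum, we shall put accordingly
$$\phi(x,t) = \sum_{H'}c_{H'}(t)\,\psi_{H'}, \tag{160 a}$$
with
$$\psi_{H'} = \psi^0_{H'}(x)\,e^{-i2\pi H't/h},$$
or
$$\phi(x,t) = \sum_{H'}C_{H'}(t)\,\psi^0_{H'}, \tag{160 b}$$
where
$$C_{H'}(t) = c_{H'}(t)\,e^{-i2\pi H't/h}. \tag{160 c}$$

Substituting (160 a) in (160) and taking into account that the functions
$\psi_{H'}$ satisfy the equation $(H+p_t)\psi_{H'} = 0$, we have
$$(H+S+p_t)\sum_{H'}c_{H'}\psi_{H'} = \sum_{H'}\left[(p_tc_{H'})\,\psi_{H'} + c_{H'}S\psi_{H'}\right] = 0.$$
Since
$$S\psi_{H'} = \sum_{H''}S_{H''H'}\,\psi_{H''},$$
we get
$$\sum_{H'}\sum_{H''}\psi_{H''}\left(\delta_{H''H'}\,p_tc_{H'} + c_{H'}S_{H''H'}\right) = 0,$$
whence
$$\sum_{H'}\left(\delta_{H''H'}\,p_tc_{H'} + S_{H''H'}c_{H'}\right) = 0,$$
or, interchanging $H'$ and $H''$,
$$-\frac{h}{2\pi i}\frac{dc_{H'}}{dt} = \sum_{H''}S_{H'H''}\,c_{H''}. \tag{161}$$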

It should be remembered that the quantities $S_{H'H''}$ represent not the
matrix elements but the matrix components of the perturbation energy,
so that $S_{H'H''} = S^0_{H'H''}\,e^{i2\pi(H'-H'')t/h}$. Further, so long as $S$ contains the
time explicitly, the matrix elements $S^0_{H'H''} = \int\psi^{0*}_{H'}S\psi^0_{H''}\,dV$ must also
be certain functions of the time, so that (161) can be written in the form
$$-\frac{h}{2\pi i}\frac{dc_{H'}}{dt} = \sum_{H''}S^0_{H'H''}(t)\,e^{i2\pi(H'-H'')t/h}\,c_{H''}. \tag{161 a}$$
If we substitute in (160) the expression (160b) instead of (160a), we
get in the same way, without, however, separating $K$ into the parts
$H$ and $S$,
$$(K+p_t)\sum_{H'}C_{H'}\psi^0_{H'} = \sum_{H'}\left(\psi^0_{H'}\,p_tC_{H'} + C_{H'}K\psi^0_{H'}\right) = 0,$$
or, since $K\psi^0_{H'} = \sum_{H''}K^0_{H''H'}\psi^0_{H''}$,
$$\sum_{H''}\psi^0_{H''}\sum_{H'}\left(\delta_{H''H'}\,p_tC_{H'} + K^0_{H''H'}C_{H'}\right) = 0,$$
or finally
$$-\frac{h}{2\pi i}\frac{dC_{H'}}{dt} = \sum_{H''}K^0_{H'H''}\,C_{H''}. \tag{161 b}$$

This equation can be derived from (161 a), or the latter from it,
with the help of the relation (160c) between the coefficients $C$ and $c$ and
the relations
$$K^0_{H'H''} = H'\delta_{H'H''} + S^0_{H'H''}.$$

As already explained in Part I, § 17, the squares of the moduli of the
coefficients $c_{H'}$ or $C_{H'}$, i.e. the quantities
$$N_{H'}(t) = C_{H'}C^*_{H'} = c_{H'}c^*_{H'}, \tag{162}$$
can be interpreted as the probabilities of finding the particle at the
instant $t$ in the unperturbed state $H'$, or, using the ‘multiplex
representation’, as the relative numbers of the copies of the particle in the
state $H'$ at the instant $t$. These numbers can be determined as
functions of the time with the help of equations (161 b) or (161) if the
initial values of the coefficients $C_{H'}$ (or $c_{H'}$) at some instant $t = 0$ are
supposed to be known. We shall denote them in future by $C^0_{H'}$ and
write accordingly $N_{H'}(0) = N^0_{H'}$.

The change of the numbers $N_{H'}$ with the time can be interpreted as
the result of transitions induced by the perturbing forces. So long,
however, as two or more of the numbers $N^0_{H'}$ are different from zero,
it is impossible to ascertain the original state from which the transition
to a given state takes place.

In order to be able to speak of definite transitions to a given final
state from a given initial state, we must therefore assume that initially
all the copies of the particle were in the same state, $H'$ say. This means
that all the coefficients $C^0_{H''}$ must be set equal to zero, with the
exception of one of them, $C^0_{H'}$, which can be put equal to 1. This can be
expressed by means of the formula
$$C^0_{H''} = \delta_{H''H'}, \tag{162 a}$$
which serves to show that the coefficients $C_{H''}(t)$, not only for $t = 0$ but
also for $t > 0$, can be considered as the elements of a matrix, which we
shall call the transition matrix and shall denote by the same letter $C$.
The value of the coefficient $C_{H''}$ at the time $t$, on the assumption of
a definite initial state $H'$, will thus be denoted by
$$C_{H''}(t) = C_{H''H'}(t), \tag{162 b}$$
the initial value of the matrix $C$ being $\delta$ (that is, 1).
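
To make the meaning of the numbers $N_{H'}(t)$ concrete, equation (161 b) can be integrated numerically for a definite initial state. The sketch below assumes units in which $h/2\pi = 1$, a constant matrix $K^0_{H'H''}$ for two levels of equal unperturbed energy coupled by a matrix element `s`, and a standard fourth-order Runge-Kutta scheme; the function name and all numbers are illustrative choices, not part of the theory.

```python
import numpy as np

def integrate_transitions(K, C0, t_end, steps=2000):
    """Integrate dC/dt = -i K C, i.e. equation (161 b) with h/2pi = 1, by RK4."""
    K = np.asarray(K, dtype=complex)
    C = np.asarray(C0, dtype=complex)
    dt = t_end / steps

    def rate(c):
        return -1j * (K @ c)

    for _ in range(steps):
        k1 = rate(C)
        k2 = rate(C + 0.5 * dt * k1)
        k3 = rate(C + 0.5 * dt * k2)
        k4 = rate(C + dt * k3)
        C = C + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return C

s = 0.3                          # assumed coupling S^0 between the two states
K = np.array([[1.0, s],
              [s, 1.0]])         # K^0 = H' delta + S^0 for two equal levels H' = 1
C0 = np.array([1.0, 0.0])        # initial condition (162 a): all copies in the first state
t = 2.0
C = integrate_transitions(K, C0, t)
N = np.abs(C) ** 2               # the numbers N_{H'}(t) of equation (162)
print("N(t) =", N, " sum =", N.sum())
```

For this particular choice the copies pass periodically from the first state into the second and back, $N_2(t) = \sin^2(st)$, while the sum of the numbers $N$ remains equal to 1.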

The formula $\phi = \sum_{H''}C_{H''}(t)\,\psi^0_{H''}$ represents the general solution of
Schrödinger’s equation (160). That particular solution of it which
reduces to $\psi^0_{H'}$ at the initial instant $t = 0$ can conveniently be denoted
by $\phi_{H'}(x,t)$. We thus get for particular solutions of this type, which
approximate to the particular solutions of the equation of the
unperturbed motion $(H+p_t)\psi = 0$, the following formula:
$$\phi_{H'} = \sum_{H''}C_{H''H'}\,\psi^0_{H''}, \tag{163}$$
which shows that the transition matrix $C(t)$ can be regarded as the
transformation matrix from the wave functions $\psi^0_{H'}$ to the wave
functions $\phi_{H'}$. The latter can no longer be denoted by $\phi_{K'}$, as was done
before, since $K$ has no characteristic values; these characteristic values
can, however, be replaced by a kind of ‘reminiscence’ of the particular
solutions of the equation $(K+p_t)\phi = 0$ about the $H$-state they
represented at the instant $t = 0$.

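It can easily be shown that the functions $\phi_{H'}$, $\phi_{H''}$, etc., are mutually
orthogonal, just as are the functions $\phi_{K'}$, $\phi_{K''}$ considered before.

We have in fact
$$\left(K+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\phi_{H'} = 0, \qquad \left(K^*-\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\phi^*_{H''} = 0.$$
Multiplying the first of these equations by $\phi^*_{H''}$ and the second by $\phi_{H'}$,
subtracting one from the other, and integrating over the coordinates,
we get
$$\int\left(\phi^*_{H''}K\phi_{H'} - \phi_{H'}K^*\phi^*_{H''}\right)dV = -\frac{h}{2\pi i}\frac{d}{dt}\int\phi^*_{H''}\phi_{H'}\,dV,$$
or, since the left-hand side vanishes (so long as $K$, in spite of its
dependence upon the time, preserves the property of self-adjointness), we get
$$\frac{d}{dt}\int\phi^*_{H''}\phi_{H'}\,dV = 0.$$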


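We thus see that the value of the integral $\int\phi^*_{H''}\phi_{H'}\,dV$ does not depend
upon the time. Since at the initial moment $t = 0$ we have $\phi_{H'} = \psi^0_{H'}$
and $\phi_{H''} = \psi^0_{H''}$, it follows from this that the functions $\phi_{H'}$ satisfy,
irrespective of the time, the same orthogonality and normalizing
conditions
$$\int\phi^*_{H''}\phi_{H'}\,dV = \delta_{H''H'} \tag{163 a}$$
as the functions $\psi^0_{H'}$.
Substituting in these equations the expressions (163), we have further
$$\int\phi^*_{H''}\phi_{H'}\,dV = \sum_{H'''}\sum_{H^{\mathrm{iv}}}C^*_{H'''H''}C_{H^{\mathrm{iv}}H'}\int\psi^{0*}_{H'''}\psi^0_{H^{\mathrm{iv}}}\,dV,$$
that is,
$$\int\phi^*_{H''}\phi_{H'}\,dV = \sum_{H'''}C^*_{H'''H''}C_{H'''H'},$$
and consequently
$$\sum_{H'''}C^*_{H'''H''}C_{H'''H'} = \delta_{H''H'}. \tag{163 b}$$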
if 


This equation shows that the transition matrix is unitary ($C^\dagger = C^{-1}$),
just as are the ordinary transformation matrices, which have been
considered in the preceding sections and which do not depend upon the
time. The transformation equations (163) can be written accordingly
in the ordinary matrix form
$$\phi = C\psi^0, \quad\text{or}\quad \phi^\dagger = \psi^{0\dagger}C^\dagger. \tag{163 c}$$
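
The unitarity of the transition matrix can also be observed numerically when the perturbation really depends upon the time. The sketch below is an illustration under assumptions of its own: units with $h/2\pi = 1$, a three-level $H$, a perturbation varying harmonically in time, and an approximation of $C(t)$ by a product of short steps, in each of which $K$ is frozen at the midpoint of the step. This stepping scheme is a numerical convenience and is not the method of the text.

```python
import numpy as np

def step_matrix(K, dt):
    """Unitary matrix exp(-i K dt) for a Hermitian matrix K (h/2pi = 1)."""
    w, v = np.linalg.eigh(K)
    return (v * np.exp(-1j * w * dt)) @ v.conj().T

def transition_matrix(H, S_of_t, t_end, steps=400):
    """Approximate C(t_end), with C(0) = 1, as a product of short unitary steps."""
    dt = t_end / steps
    C = np.eye(H.shape[0], dtype=complex)
    for n in range(steps):
        t_mid = (n + 0.5) * dt            # K is frozen at the midpoint of each step
        C = step_matrix(H + S_of_t(t_mid), dt) @ C
    return C

H = np.diag([0.0, 1.0, 2.5])              # assumed unperturbed levels
S0 = 0.1 * np.array([[0.0, 1.0, 0.5],
                     [1.0, 0.0, 1.0],
                     [0.5, 1.0, 0.0]])    # assumed coupling matrix

def S_of_t(t):
    return S0 * np.cos(1.0 * t)           # assumed harmonic time dependence

C = transition_matrix(H, S_of_t, t_end=5.0)
unitarity_error = np.abs(C.conj().T @ C - np.eye(3)).max()
column_sums = (np.abs(C) ** 2).sum(axis=0)   # sum over final states for each initial state
print("deviation from unitarity:", unitarity_error)
print("sums of the numbers N for each initial state:", column_sums)
```

Each column of `C` belongs to one initial state $H'$; the fact that every column sum equals 1 expresses the conservation of the total number of copies, whatever the time dependence of $S$.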

It follows from these results that the functions $\phi_{H'}$ specify perfectly
definite states in the same sense as those which would be represented
by the functions $\phi_{K'}$ if $K$ were independent of the time and had definite
characteristic values; the only difference between them being that the
former vary with the time while the latter should remain constant.

The set of states specified by the functions $\phi_{H'}$ can be represented
geometrically as an orthogonal system of coordinates in the state-space,
the transformation coefficients $C_{H''H'}$ denoting the cosines of the angles
between the fixed axes which represent the states $\psi^0_{H''}$ and the movable
axes which represent the states $\phi_{H'}$. This movable system of axes,
rotating like a solid body in the state-space, can be regarded as the
geometrical representation of the variable energy $K$.

One might be inclined to go a step further and to represent $K$ by
a quadric surface defined by the equation
$$\sum_{H'}\sum_{H''}K^0_{H'H''}\,a^*_{H'}a_{H''} = \text{const.},$$
thus fixing not only the directions but also the lengths of the axes
associated with $K$, i.e. the characteristic values of the latter. This
argument is, however, fallacious because the preceding equation has nothing
to do with the representation of the variable surface $K$, which we have
been considering, but represents in reality the fictitious ‘quasi-constant’
energy operator $K$ with the time treated as a simple parameter.

The fallacy of the above argument becomes especially apparent when
$K$ is actually constant (which can be considered as a special case of
a variable $K$). The equation $\sum_{H'}\sum_{H''}K^0_{H'H''}\,a^*_{H'}a_{H''} = \text{const.}$ will then
represent $K$ as a quadric surface fixed in the state-space. Nothing,
however, will prevent us from solving the equation $(K+p_t)\phi = 0$ in
this case in the same way as in the preceding case, namely, by taking
particular solutions not of the usual $K$-type, $\phi_{K'} = \phi^0_{K'}e^{-i2\pi K't/h}$, but of
the $H$-type, i.e. such that, at the initial moment $t = 0$, $\phi$ coincides with
one of the functions $\psi^0_{H'}$. The functions $\phi_{H'}$ so obtained will represent
for $t \neq 0$ states entirely different both from those specified by the
functions $\psi^0_{H'}$ and from those specified by the functions $\phi_{K'}$. In order
to avoid confusion, we shall denote the characteristic functions of $K$
(when they exist of course, i.e. when $K$ is independent of the time) by
$\chi_{K'}$ instead of $\phi_{K'}$. The connexion between these functions and the
functions $\psi_{H'}$,
$$\chi^0_{K'} = \sum_{H'}a_{H'K'}\,\psi^0_{H'}, \tag{164}$$


which has been investigated before, is represented by a constant 
transformation matrix a, which has nothing to do with the variable 
matrices C and c. 

It should be remarked that the elements of these matrices are
connected with each other, according to (160c), by the relation
$$c_{H'H''} = C_{H'H''}\,e^{i2\pi H't/h},$$
which is not symmetrical with regard to the two indices and is in
agreement with the unitary character of the two matrices.

The transformation matrix $a$ can be derived from the general
equations (161 b) if the condition that the function $\phi$ should reduce to $\psi^0_{H'}$
for $t = 0$ is replaced by the condition that it should be a harmonic
function of the time of the type
$$\phi = \chi_{K'} = \chi^0_{K'}\,e^{-i2\pi K't/h}. \tag{164 a}$$
This means, on account of the equation
$$\phi = \sum_{H'}C_{H'}\,\psi^0_{H'},$$
that all the coefficients $C_{H'}$ should also be of the type
$$C_{H'} = C^0_{H'}\,e^{-i2\pi K't/h}. \tag{165}$$
The differential equations (161 b) reduce, subject to this condition, to
a system of ordinary algebraic equations for the amplitudes $C^0_{H'}$,
$$K'C^0_{H'} = \sum_{H''}K^0_{H'H''}\,C^0_{H''}, \tag{165 a}$$
which are obviously identical with the equations determining the
transformation coefficients $a$.
We thus get $C^0_{H'} = a_{H'}$, or more exactly
$$C^0_{H'K'} = a_{H'K'}. \tag{165 b}$$
The relations between the functions $\chi_{K'} = \chi^0_{K'}e^{-i2\pi K't/h}$ and
$\psi_{H'} = \psi^0_{H'}e^{-i2\pi H't/h}$ can be obtained from (164) if the coefficients $a_{H'K'}$
are replaced by
$$\mathfrak{a}_{H'K'} = a_{H'K'}\,e^{i2\pi(H'-K')t/h}. \tag{166}$$
These coefficients also constitute a unitary matrix $\mathfrak{a}$. Combining the
matrix equations
$$\chi = \psi\mathfrak{a} \quad\text{and}\quad \phi = \psi c,$$
we can easily obtain a direct relation between the functions $\phi$ and $\chi$.
We have, namely,
$$\psi = \chi\mathfrak{a}^{-1} = \chi\mathfrak{a}^\dagger,$$
and consequently
$$\phi = \chi d,$$
with the transformation matrix
$$d = \mathfrak{a}^\dagger c.$$
Written in matrix elements, these equations run
$$\phi_{H'} = \sum_{K'}d_{K'H'}\,\chi_{K'} \tag{166 a}$$
with
$$d_{K'H'} = \sum_{H''}\mathfrak{a}^\dagger_{K'H''}c_{H''H'} = \sum_{H''}\mathfrak{a}^*_{H''K'}c_{H''H'},$$
or
$$d_{K'H'} = \sum_{H''}a^*_{H''K'}\,c_{H''H'}\,e^{i2\pi(K'-H'')t/h}. \tag{166 b}$$
Putting
$$d_{K'H'} = D_{K'H'}\,e^{i2\pi K't/h},$$
we can rewrite (166a) in the more convenient form
$$\phi_{H'} = \sum_{K'}D_{K'H'}\,\chi^0_{K'} \tag{166 c}$$
with
$$D_{K'H'} = \sum_{H''}a^*_{H''K'}\,C_{H''H'}, \tag{166 d}$$
showing that the dependence of $\phi_{H'}$ on the time is fully determined by
the transformation coefficients $C_{H''H'}$.
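
For a constant $K$ the relation (166 d) can be checked directly: the coefficients $a_{H'K'}$ are obtained by solving the algebraic equations (165 a), the transition matrix $C(t)$ by integrating (161 b), and the product $a^\dagger C$ should then vary with the time only through the harmonic factors $e^{-i2\pi K't/h}$. The following sketch assumes $h/2\pi = 1$ and an arbitrarily chosen real symmetric matrix for $K^0_{H'H''}$; the integration by the Runge-Kutta method is merely a convenient numerical device.

```python
import numpy as np

K = np.array([[0.0, 0.2, 0.1],
              [0.2, 1.0, 0.3],
              [0.1, 0.3, 2.5]])           # assumed constant matrix K^0_{H'H''}
levels, a = np.linalg.eigh(K)              # columns a[:, K'] solve equations (165 a)

def transition_matrix(t, steps=4000):
    """C(t), with C(0) = 1, from dC/dt = -i K C (h/2pi = 1), integrated by RK4."""
    dt = t / steps
    C = np.eye(3, dtype=complex)

    def rate(c):
        return -1j * (K @ c)

    for _ in range(steps):
        k1 = rate(C)
        k2 = rate(C + 0.5 * dt * k1)
        k3 = rate(C + 0.5 * dt * k2)
        k4 = rate(C + dt * k3)
        C = C + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return C

t = 3.0
C = transition_matrix(t)
D = a.conj().T @ C                                     # equation (166 d)
D_harmonic = np.exp(-1j * levels * t)[:, None] * a.conj().T
print("largest deviation:", np.abs(D - D_harmonic).max())
```

The agreement of `D` with `D_harmonic` expresses the fact that, when $K$ is constant, each function $\phi_{H'}$ is a superposition of the harmonic functions $\chi_{K'}$ with constant coefficients $a^*_{H'K'}$.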

Equations of exactly the same type as (165a) are obtained in classical 
mechanics for the amplitudes of the free oscillations of a system of 
particles held together by ‘quasi-elastic’ forces, i.e. forces which are 
proportional to their displacements both from the respective equilibrium 
positions and relative to each other. Such a system can be realized in 
the simplest form by a set of coupled pendulums which can oscillate 
in a definite plane under the influence of gravity and of forces due to
their being coupled together (by means of lateral strings or otherwise).

Let $\xi_1, \xi_2, \dots$ be the displacements of the given particles (or
pendulums) from their position of rest. Their dependence upon the time is
determined by a system of equations of the form
$$-\frac{d^2\xi_n}{dt^2} = \sum_m\Phi_{nm}\,\xi_m. \tag{167}$$
The coefficients $\Phi_{nn}$ thus specify the binding of the separate particles
to their positions of rest, and so determine the free vibrations which
they would carry out in the absence of any coupling with the other
particles. The coefficients $\Phi_{nm} = \Phi_{mn}$ ($m \neq n$) describe, on the other
hand, the perturbing coupling forces.
If we put
$$\Phi_{nn} = \Phi^0_{nn}+\Phi'_{nn}, \qquad \Phi_{nm} = \Phi'_{nm} \quad (n \neq m),$$
we can then regard the above equations as the equations of the perturbed
motion of the given quasi-elastic system. By the unperturbed motion
we are to understand the vibrations determined by the equations
$$-\frac{d^2\xi_n}{dt^2} = \Phi^0_{nn}\,\xi_n.$$
In this case each particle (pendulum or current) vibrates quite
independently of the others and with a frequency $\omega^0_n = \sqrt{\Phi^0_{nn}}$.

In the presence of perturbing coupling forces such independent 
harmonic vibrations of the separate particles (or pendulums) are not 
possible. They become replaced by harmonic vibrations of a different 
kind, the so-called ‘normal vibrations’ of the system, in which, with regard
to any kind of vibration characterized by the common frequency $\omega$,
all particles participate with definite relative amplitudes and definite
phase differences. The real amplitude and the initial phase (at time
$t = 0$) of each particle can be defined respectively as the modulus and
the argument of a complex amplitude $\gamma_n = |\gamma_n|e^{i\alpha_n}$. These complex
amplitudes and the corresponding frequencies of vibration can be
determined from the equations of motion if we make the substitution
$$\xi_n = \gamma_n\,e^{i\omega t} \tag{167 a}$$
for the variables $\xi_n$. Equations (167) then reduce to the form
$$\sum_m\Phi_{nm}\,\gamma_m = \omega^2\gamma_n, \tag{167 b}$$
and thus with $\omega^2 = K'$ and $\Phi_{nm} = K^0_{H'H''}$ become identical with the
‘wave mechanics’ equations (165 a).

† Instead of a mechanical model we could use, for the illustration of the equations
(165 a), an electric model, formed by a system of electrically coupled electric circuits.

The general solution of the classical vibration problem (167), just
as that of the corresponding ‘wave mechanics’ problem $(K+p_t)\chi = 0$, is
obtained by superposition of all harmonic particular solutions (with
arbitrary constant coefficients).
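
The determination of the normal vibrations from the algebraic equations (167 b), and the construction of the general solution by superposition, can be sketched numerically as follows. The example assumes two identical pendulums joined by a weak spring, so that the matrix $\Phi_{nm}$ has the simple form written in the code; the numerical values and the function names are illustrative only.

```python
import numpy as np

def normal_modes(Phi):
    """Solve (167 b): sum_m Phi[n, m] gamma[m] = omega**2 gamma[n]."""
    omega_sq, gamma = np.linalg.eigh(Phi)   # Phi is assumed symmetric and positive definite
    return np.sqrt(omega_sq), gamma

omega0_sq = 4.0      # assumed Phi^0_nn of each pendulum taken alone
kappa = 0.1          # assumed strength of the coupling
Phi = np.array([[omega0_sq + kappa, -kappa],
                [-kappa, omega0_sq + kappa]])
omega, gamma = normal_modes(Phi)

def displacement(t, xi0):
    """xi(t) for initial displacements xi0 and vanishing initial velocities,
    built up as a superposition of the normal vibrations."""
    amplitudes = gamma.T @ xi0
    return gamma @ (amplitudes * np.cos(omega * t))

xi0 = np.array([1.0, 0.0])                  # only the first pendulum is displaced
t_transfer = np.pi / (omega[1] - omega[0])  # time after which the first pendulum is at rest
print("squares of the normal frequencies:", omega ** 2)
print("displacements at t_transfer:", displacement(t_transfer, xi0))
```

With only the first pendulum displaced initially, the vibration passes over to the second pendulum after the time `t_transfer`, which is the classical counterpart of the gradual transition of the copies of a particle from one unperturbed state to another.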

The similarity of the two problems enables us to relate the perturba- 
tion theory of quantum mechanics, in a very clear manner, to the 
classical theory of weakly coupled particles or pendulums. The ‘pendu- 
lum model’ (which can serve just as well for the illustration both of the 
wave-mechanical and the electromagnetic vibrations) proves to be 
especially convenient. Such a model consists of an infinite series of 
pendulums which are suspended along a horizontal line in the order 
of increasing frequencies of the unperturbed vibrations, i.e. in the 
order of decreasing lengths, and which can be bound to one another 
in pairs (see Fig. 2). Thus each pendulum corresponds to a definite 
quantized state of the unperturbed system (atom, molecule), i.e. to a 
definite characteristic function $\psi^0_{H'}$. In the case of ‘degeneracy’, i.e.
when several different pendulums have the same unperturbed vibration
frequency $\nu^0 = H'/h$, we can ascribe the same length to the
corresponding pendulums (in general, however, a different mass) and place
them beside one another transversely to the original direction of 
suspension. 

If, under the given conditions of the motion, there exists, besides 
a discrete set of states, also a continuous set of stationary states, then 
the discrete pendulum series of our model must be supplemented by a 
continuous series, which can be conceived as a compact heavy fabric. For 
this fabric not to tear, the amplitudes and phases of the vibration of 
its vertical elements must be continuous functions of the (unperturbed) 
vibration frequency $\nu^0 = H'/h$.†

From the point of view of the wave conception, the correspondence
between the vibrations of our pendulum model and the vibration
process in the corresponding mechanical system is very
straightforward and suggestive. Thus the different types of standing waves
represented by the functions $\psi^0_{H'}$ play the role of the single
pendulums; while the coefficients $C_{H'}$ (or $c_{H'}$) are the (complex)
amplitudes of vibration.

† We could replace the pendulum model by a string model (limiting ourselves to the
fundamental vibrations of each string). The continuous spectrum in this model would
be represented by a membrane. Such a membrane must, however, possess quite unusual
properties which are incompatible with the ordinary equations of the theory of elasticity
(for these equations correspond to a coupling between the neighbouring elements of the
elastic continuum only).

This correspondence acquires a purely symbolic character, however, 
when we go over from the wave picture to the corpuscular picture. The 
amplitude coefficients then acquire a quite different physical meaning; 
for their norms $C_{H'}C^*_{H'} = |C_{H'}|^2$ then determine the relative number
of the copies of the given particle which are in the corresponding state.
To the continuous alteration of these coefficients with the time under
the action of the perturbing forces there corresponds a series of forced
transitions of these copies from one state to another. The derivative
$d|C_{H'}|^2/dt$ then gives the probability, referred to unit time, that any
copy of the particle will go over into the state $\psi^0_{H'}$ if $d|C_{H'}|^2/dt > 0$ or
out of this state if $d|C_{H'}|^2/dt < 0$.

Fig. 2.

One important difference between the pendulum model and the 
wave-mechanical vibrations it represents, consists in the normalization 
of the amplitudes of vibration to a definite value (1). A system of 
pendulums, as considered in classical mechanics, can be at rest; or if 
the system is vibrating, one has to distinguish not only the relative but 
also the absolute values of the amplitudes. So far as this model is used 
for the illustration of wave-mechanical vibrations a state of rest is 
excluded—for the particle must always be found in some one of the 
states represented by the pendulums. Moreover, only the relative values 
of the amplitudes have a physical significance as defining the probability 
amplitudes of the corresponding states—which can be taken into 
account by normalizing the sum of their norms once and for all to 1. 

In the case of certain relations between the amplitudes $\gamma_n$ of the
various pendulums, these amplitudes can preserve constant values, as we 
have seen above. Such ‘normal vibrations’ of the system of pendulums 
correspond to stationary distributions of the copies of the particles
among the different unperturbed states, and represent the stationary
states in the presence of the perturbing forces (i.e. states defined by 
the energy K). If we introduce for the illustration of the perturbed 
motion, i.e. of the vibrations defined by the operator $K$, a pendulum
model of the same kind as for the unperturbed motion (i.e. the
$H$-vibrations), then any such stationary distribution, i.e. any normal
vibration of the original model, will be represented by the vibrations
of a single pendulum of the new model. These new pendulums,
representing the transformed characteristic functions $\chi_{K'}$, must clearly be
considered as uncoupled. This means that transitions between the new
stationary states (which are the real stationary states) are impossible. 

A transition between two different unperturbed states H’ and H” is 
possible in the first place if the corresponding matrix element of the 
perturbation energy $S^0_{H'H''}$ is different from zero. The coupling
coefficients $\Phi'_{nm}$, which represent these elements in our pendulum model, can
be regarded as a measure of the probability amplitude for transitions
between the corresponding states. It can easily be seen, however, that
transitions are also possible between unperturbed states $H'$ and $H''$
which are only indirectly coupled with each other, the matrix element
$S^0_{H'H''}$ vanishing, but certain other elements of the type $S^0_{H'H'''}$ and
$S^0_{H'''H''}$ being different from zero. Such ‘indirect transitions’ play, as
we shall see later on, an important role in many physical phenomena. 

In the case of the stationary $K'$-states represented by a stationary
distribution of the copies over the various H-states—or by normal 
vibrations of the pendulum-system—the transitions between different 
H-states can be imagined to be mutually compensated. 

The variable $K_{H'}$-states which are described by the functions $\phi_{H'}$ can
be represented in our pendulum model by vibrations which at the initial
time $t = 0$ involve one particular pendulum ($H'$) only. As time goes
on, the vibrations of this pendulum must be gradually transferred to 
other pendulums, this transference representing the gradual transition 
of the copies of the particle from the state H’ in which they were 
initially supposed to be concentrated (whose probability, in other words, 
was initially equal to 1) to other states. 

If the energy K, or what amounts to the same thing the perturbation 
energy $S$, depends upon the time, only $K_{H'}$-states of this type can be
defined and represented by means of the pendulum model, while normal 
vibrations corresponding to definite values of K are impossible. 

It is natural to consider vibrations due to an external influence, 
specified as a given function of the time, as ‘forced vibrations’. It must
be borne in mind, however, that the forced vibrations we are referring
to are not of the usual type described by the non-homogeneous equations
$$-\frac{d^2\xi_n}{dt^2} = \sum_m\Phi_{nm}\,\xi_m + F_n(t),$$
where $F_n(t)$ denotes the external force acting on the $n$th pendulum.
Such external forces do not have any place in our model. They are
replaced by a so-called ‘parametric perturbation’, i.e. by a change of
the parameters $\Phi_{nm}$, which determine the free vibrations of the
pendulums. In fact, the case of a perturbation energy depending upon the
time can be represented, in the pendulum model, by a type of forced
vibrations determined by the equations
$$-\frac{d^2\xi_n}{dt^2} = \Phi^0_{nn}\,\xi_n + \sum_m\Phi'_{nm}(t)\,\xi_m.$$


The model will, however, adequately reproduce the actual conditions 
only when the dependence of S upon the time is harmonic and if, 
besides, we restrict ourselves to the case of small perturbing forces; 
otherwise the agreement between the wave-mechanical equations (161 a) 
or (161 b) and the classical equations will be destroyed on account of 
the fact that in the former we have first derivatives with respect to 
the time (multiplied by $h/2\pi i$), while in the latter we have second
derivatives ($d^2\xi_n/dt^2$). This difference is immaterial only in the case of
harmonic vibrations represented by exponential functions of the type
$e^{i2\pi\nu t}$, the differentiation with regard to the time being in both cases
equivalent to multiplication by a real constant.

The preceding theory can easily be extended to the case of a con- 
tinuous or mixed energy spectrum of the unperturbed motion. 

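Writing, for example,

$$\phi(t) = \sum_{H'} c_{H'}(t)\,\psi_{H'} + \int c_{H'''}(t)\,\psi_{H'''}\,dH''' \tag{168}$$

instead of (160 a), we get

$$(H+S+p_t)\phi = \sum_{H'}\Bigl[\frac{h}{2\pi i}\frac{dc_{H'}}{dt}\,\psi_{H'} + c_{H'}\,S\psi_{H'}\Bigr] + \int\Bigl[\frac{h}{2\pi i}\frac{dc_{H'''}}{dt}\,\psi_{H'''} + c_{H'''}\,S\psi_{H'''}\Bigr]dH''' = 0.$$

We have further

$$S\psi_{H'} = \sum_{H''} S_{H''H'}\,\psi_{H''} + \int S_{H'''H'}\,\psi_{H'''}\,dH''',$$

$$S\psi_{H'''} = \sum_{H''} S_{H''H'''}\,\psi_{H''} + \int S_{H''''H'''}\,\psi_{H''''}\,dH'''',$$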


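where $H'$ and $H''$ refer to the discrete and $H'''$ and $H''''$ to the continuous
region of the $H$-spectrum, and consequently

$$-\frac{h}{2\pi i}\frac{dc_{H'}}{dt} = \sum_{H''} S_{H'H''}\,c_{H''} + \int S_{H'H'''}\,c_{H'''}\,dH''',$$

$$-\frac{h}{2\pi i}\frac{dc_{H'''}}{dt} = \sum_{H'} S_{H'''H'}\,c_{H'} + \int S_{H'''H''''}\,c_{H''''}\,dH''''. \tag{168 a}$$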


The only difference between the discrete and the continuous case is
that in specifying the states we must, in general, replace the discrete
values of $H'$ by elementary regions or ranges of $H''$, the number of the
copies belonging to the range $\Delta H''$ being equal to
$\int_{\Delta H''} |c_{H''}|^2\,dH''$, provided
the functions $\psi_{H''}$ are duly normalized according to the equation

$$\int \psi^{*}_{H''}\,\psi_{H'''}\,dV = \delta(H''-H''') \quad\text{or}\quad \int \psi^{0*}_{H''}\,\psi^{0}_{H'''}\,dV = \delta(H''-H''').$$

It should be remembered that this condition is equivalent to the usual
normalizing condition $\int |\bar\psi_{H''}|^2\,dV = 1$ for the quasi-discrete functions

$$\bar\psi_{H''} = \lim_{\Delta H''\to 0}\frac{1}{\sqrt{\Delta H''}}\int_{(\Delta H'')}\psi_{H''}\,dH''.$$

With the help of the latter the case of a continuous spectrum can be
dealt with in exactly the same way as the discrete case, provided we
start with finite ranges $\Delta H''$ and pass to the limit $\Delta H'' \to 0$ after having
calculated the coefficients $c$.

The actual determination of the perturbed motion by the method of
transitions explained above, both in the case of a variable energy $K$
and in the special case of a constant $K$, can be carried out by means
of a process of successive approximations, based upon the following
consideration. If there were no perturbation, then the coefficients $c$
(but not $C$!) would remain constant, preserving those values $c^0$ which
they were supposed to have at the initial moment $t = 0$. The action
of the perturbation will be to modify these values, so that we can put
$c(t) = c^0+\Delta c(t)$ and consider $\Delta c(t)$ as a small quantity, for sufficiently
weak perturbing forces and, in general, for sufficiently small values of $t$.
The latter condition constitutes an important restriction of the validity
of the approximation method in question, a restriction that does not
have any equivalent in the alternative method dealing with stationary
states and not involving the time (if $K$ does not depend upon the time).

It is, however, perfectly natural from the physical point of view, 
since, in the determination of transition probabilities, we have to limit 
ourselves to short intervals of time. Regarding the matrix components
$S_{H'H''}$ as small quantities of the first order, we can put

$$c_{H'}(t) = c^0_{H'} + \Delta_1 c_{H'}(t) + \Delta_2 c_{H'}(t) + \dots$$

and obtain the corrections $\Delta_1 c$, $\Delta_2 c$, etc., by the usual scheme of
successive approximations.
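Confining ourselves again, for the sake of simplicity, to the case of
a discrete spectrum, we obtain a chain of equations starting with

$$-\frac{h}{2\pi i}\frac{d}{dt}\Delta_1 c_{H'} = \sum_{H''} S_{H'H''}\,c^0_{H''} \tag{169}$$

(first approximation),

$$-\frac{h}{2\pi i}\frac{d}{dt}\Delta_2 c_{H'} = \sum_{H''} S_{H'H''}\,\Delta_1 c_{H''} \tag{169 a}$$

(second approximation), and so on. Since the matrix components $S_{H'H''}$
are known functions of the time, equations (169) can be integrated
directly with the result

$$\Delta_1 c_{H'}(t) = -\frac{2\pi i}{h}\sum_{H''} c^0_{H''}\int_0^t S_{H'H''}(t')\,dt', \tag{170}$$

which, on substitution in (169 a), gives

$$\Delta_2 c_{H'}(t) = -\frac{4\pi^2}{h^2}\sum_{H''}\sum_{H'''} c^0_{H''}\int_0^t dt'\,S_{H'H'''}(t')\int_0^{t'} dt''\,S_{H'''H''}(t''). \tag{170 a}$$

In a similar way one can obtain an expression for $\Delta_n c_{H'}(t)$ which is of
the $n$th order with respect to the small quantities $S_{H'H''}$, etc.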

The function $S$ can usually be represented in the form of a product
of a function of the coordinates and a function of the time:

$$S = T(x,y,z)\,f(t), \tag{171}$$

or more generally as a sum of terms of this type. We get accordingly

$$S_{H'H''} = T^0_{H'H''}\,f(t)\,e^{i2\pi\nu_{H'H''}t}, \tag{171 a}$$

and

$$\int_0^t S_{H'H''}(t')\,dt' = T^0_{H'H''}\,f_{\nu_{H'H''}}(t), \tag{171 b}$$

where $\nu_{H'H''} = (H'-H'')/h$ and

$$f_\nu(t) = \int_0^t f(t')\,e^{i2\pi\nu t'}\,dt'. \tag{171 c}$$


This function can be defined as the amplitude coefficient in the Fourier
integral representation of the function $f(t')$ within the interval $0 < t' < t$,
or more exactly of a function which is equal to $f(t')$ within this interval
and vanishes outside it. The latter function

$$f(t') = \int_{-\infty}^{+\infty} f_\nu(t)\,e^{-i2\pi\nu t'}\,d\nu$$

can replace the actual function $f(t)$ so far as we are interested in the
results produced by the perturbation $S$ during the limited time $t$.
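Turning to the quantities $N_{H'} = |c_{H'}|^2$, we get

$$N_{H'} = |c^0_{H'}|^2 + (c^0_{H'}\,\Delta_1 c^{*}_{H'} + c^{0*}_{H'}\,\Delta_1 c_{H'}) + |\Delta_1 c_{H'}|^2 + (c^0_{H'}\,\Delta_2 c^{*}_{H'} + c^{0*}_{H'}\,\Delta_2 c_{H'}) + \dots \tag{172}$$

Terms of higher order will not be needed in future and have accordingly
been dropped. In the particular case when $c^0_{H'} = 0$, this expression
reduces to

$$N_{H'} = |\Delta_1 c_{H'}|^2. \tag{172 a}$$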


If initially the particle were supposed to be in a definite state, $H'$ say
(so that $c^0_{H''} = c^0_{H''H'} = \delta_{H''H'}$), equations (170) and (170 a) reduce to

$$\Delta_1 c_{H''H'} = -\frac{2\pi i}{h}\int_0^t S_{H''H'}\,dt \tag{173}$$

(with $H''$ and $H'$ interchanged) and

$$\Delta_2 c_{H''H'} = -\frac{4\pi^2}{h^2}\sum_{H'''}\int_0^t dt'\,S_{H''H'''}(t')\int_0^{t'} dt''\,S_{H'''H'}(t''). \tag{173 a}$$

These equations give the first and second approximation for the
elements of the ‘transition matrix’ $c_{H''H'}$. We need not consider here
their geometrical representation (as determining the angles between the 
fixed H-axes and the rotating K,,;-axes in the state-space), since it is 
identical with that of the transformation coefficients a, discussed in § 19. 

It is also hardly necessary to point out the way in which the pre- 
ceding equations can be generalized to allow for the presence of a con- 
tinuous or mixed spectrum; all we need to do in this case is to replace 
the sums wholly or partially by integrals extended over the continuously 
variable parameters. 

The equations (173) and (173 a), as well as the higher approximations
for $c$, can be obtained in a more straightforward, though somewhat
symbolic, way by considering the coefficients $c_{H''H'}(t)$ as a matrix and
writing the equations (161), which serve to define them, in the matrix
form

$$-\frac{h}{2\pi i}\frac{dc}{dt} = Sc. \tag{174}$$

We thus get, treating $S$ as an ordinary function of the time,

$$c(t) = e^{-\frac{2\pi i}{h}\int_0^t S\,dt}\,c(0), \tag{174 a}$$

or putting, for the sake of brevity, $\frac{2\pi}{h}\int_0^t S\,dt = R$ and expanding the
exponential in a power series

$$c(t) = (1 - iR - \tfrac12 R^2 + \dots)\,c(0). \tag{174 b}$$


This formula contains the two equations (173) and (173 a) as
corresponding to the terms of the first and second order in the expansion.
It is self-evident that all the multiplications must be carried out in
the order stated, according to the general rule of matrix multiplication,
and that, moreover, the matrix $c(0)$ must be defined as the unit matrix

$$c_{H''H'}(0) = \delta_{H''H'}.$$

It may seem at first sight that there is a discrepancy between the
expression (173 a) and the second-order term of (174 b)

$$\Delta_2 c_{H''H'} = -\tfrac12\,(R^2)_{H''H'} = -\tfrac12\sum_{H'''} R_{H''H'''}\,R_{H'''H'},$$

i.e.

$$\Delta_2 c_{H''H'} = -\frac{2\pi^2}{h^2}\sum_{H'''}\int_0^t S_{H''H'''}\,dt'\int_0^t S_{H'''H'}\,dt''. \tag{174 c}$$

As a matter of fact, they are easily seen to be identical (by a
generalization of the well-known relation for multiple integrals with the same
variable).

Since the exponent of the first factor in (174 a) is a pure imaginary, we get at once
the relation

$$c^{\dagger}(t)\,c(t) = c^{\dagger}(0)\,c(0) = \delta,$$

which means that $\sum_{H''} |c_{H''H'}(t)|^2 = 1$, in agreement with the elementary
theory of Part I (§ 18) or with the formula (163 b) of this section.

It should be mentioned, in conclusion, that the case of a variable 
perturbation can be dealt with by a method similar to that of Born 
for the case of a constant perturbation in the theory of stationary states 
(§ 20). We can, in fact, determine the functions $\phi_{H'}$, which are the
particular solutions of the equation $(H+S+p_t)\phi = 0$ reducing to
$\phi_{H'} = \psi_{H'}$ at the initial instant $t = 0$, by putting

$$\phi_{H'} = \psi_{H'} + \Delta_1\psi_{H'} + \Delta_2\psi_{H'} + \dots$$

and integrating successively the chain of equations

$$(H+p_t)\,\Delta_1\psi_{H'} = -S\psi_{H'},$$

$$(H+p_t)\,\Delta_2\psi_{H'} = -S\,\Delta_1\psi_{H'},$$

etc., subject to the condition that $\Delta_1\psi_{H'} = \Delta_2\psi_{H'} = \dots = 0$ for $t = 0$.
This method can be advantageously applied in the case of continuous 
spectra. It is, of course, completely equivalent to the method explained 
above, differing from it only by avoiding the use of the coefficients c. 


22. First Approximation; Theory of Simple Transitions

The study of transitions produced by a perturbing force can con- 
veniently be divided into two parts, corresponding to the first and to 
the second approximation of the general theory. The first-order terms 
determine the probability of simple (or direct) transitions between two 
states, which have been dealt with already to some extent in Part I, 
§ 18; while the second-order terms mainly determine the probability of 
combined transitions, involving intermediate states. 

So far as the action of variable forces is concerned, we shall restrict
ourselves to the case of a harmonically oscillating force represented by
the expression (171) with $f(t) = \cos(2\pi\nu t+\beta)$. In the general case of
a force represented by a sum (or integral) of terms of this form with
different frequencies $\nu$, $\Delta_1 c_{H''H'}$ reduces to the sum (or integral) of parts
corresponding to the separate harmonic terms of $S$.

Putting $f(t) = \tfrac12[e^{i(2\pi\nu t+\beta)} + e^{-i(2\pi\nu t+\beta)}]$, we get, according to (170),
(169b), and (169c),

$$\Delta_1 c_{H''H'} = -\tfrac12\,T^0_{H''H'}\Bigl[e^{i\beta}\,\frac{e^{i2\pi(H''-H'+h\nu)t/h}-1}{H''-H'+h\nu} + e^{-i\beta}\,\frac{e^{i2\pi(H''-H'-h\nu)t/h}-1}{H''-H'-h\nu}\Bigr], \tag{175}$$

which can also be written in the form

$$\Delta_1 c_{H''H'} = -\frac{1}{2h}\,T^0_{H''H'}\Bigl[e^{i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}+\nu)t}-1}{\nu_{H''H'}+\nu} + e^{-i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}-\nu)t}-1}{\nu_{H''H'}-\nu}\Bigr], \tag{175 a}$$

involving the transition frequencies $\nu_{H''H'} = (H''-H')/h$ instead of the
energy values.

As pointed out in Part I, § 18, these expressions, regarded as
functions of the time, have two entirely different characters depending upon
whether the absolute value of the transition frequency $\nu_{H''H'}$ coincides
with $\nu$ (‘resonance’) or not.

In the latter case $\Delta_1 c_{H''H'}$ oscillates about the value zero, while $N_{H''}$,
as determined by (172 a) (for a state $H''$ different from $H'$), oscillates
about a small (positive) average value

$$\bar N_{H''} = \frac{|T^0_{H''H'}|^2}{2h^2}\Bigl[\frac{1}{(\nu_{H''H'}+\nu)^2} + \frac{1}{(\nu_{H''H'}-\nu)^2} + \frac{\cos 2\beta}{\nu_{H''H'}^2-\nu^2}\Bigr], \tag{176}$$

representing the average number of copies of the particle in the initially
vacant state $H''$.

In the case of resonance ($\nu = \pm\nu_{H''H'}$) one of the two terms in the
square brackets of (175 a) becomes infinite, which means that a stationary
distribution is impossible, i.e. that the number of copies in the state $H''$
is steadily increasing. With the help of the formula

$$\lim_{\delta\to 0}\frac{e^{i2\pi\delta t}-1}{\delta} = i2\pi t$$

we get in this case, according to (175),

$$\Delta_1 c_{H''H'} = -\frac{1}{2h}\,T^0_{H''H'}\,\bigl[e^{\mp i\beta}\,2\pi i t + \text{periodic term}\bigr]$$

(the upper sign referring to $H'' > H'$ and the lower sign to
$H'' < H'$), that is, dropping the periodic term which remains small
while $t$ increases:

$$N_{H''} = \frac{\pi^2}{h^2}\,|T^0_{H''H'}|^2\,t^2. \tag{176 a}$$
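
The contrast between the two cases can be checked numerically. The following is a minimal sketch, assuming units in which $h = 1$ and an illustrative matrix element `T0`; the function names are ours and do not occur in the text. It evaluates the first-order amplitude (175 a), using the limiting formula above when a denominator vanishes.

```python
import cmath
import math

H = 1.0  # assumption: units in which Planck's constant h = 1


def delta1_c(t, T0, nu_tr, nu, beta=0.0, h=H):
    """First-order amplitude of (175 a) for f(t) = cos(2*pi*nu*t + beta).

    nu_tr stands for the transition frequency (H'' - H')/h.
    """
    def term(phase, detuning):
        # (exp(i*2*pi*detuning*t) - 1) / detuning, replaced by its limit
        # i*2*pi*t when the detuning vanishes (exact resonance).
        if abs(detuning) < 1e-12:
            return phase * 2j * math.pi * t
        return phase * (cmath.exp(2j * math.pi * detuning * t) - 1) / detuning

    plus = term(cmath.exp(1j * beta), nu_tr + nu)
    minus = term(cmath.exp(-1j * beta), nu_tr - nu)
    return -(T0 / (2 * h)) * (plus + minus)


def population(t, T0, nu_tr, nu, beta=0.0, h=H):
    """N = |Delta_1 c|^2, as in (172 a)."""
    return abs(delta1_c(t, T0, nu_tr, nu, beta, h)) ** 2


if __name__ == "__main__":
    T0 = 0.01  # illustrative value of the matrix element
    for t in (1.0, 2.0, 4.0):
        resonant = population(t, T0, nu_tr=5.0, nu=5.0)
        detuned = population(t, T0, nu_tr=3.0, nu=5.0)
        print(t, resonant, detuned)
```

With the resonant choice the printed population grows in proportion to $t^2$, as in (176 a), while the detuned one remains bounded whatever the value of $t$.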


A perturbing force is usually said to induce transitions from the state
$H'$ to $H''$ only when these transitions are manifested as a systematic
increase of $N_{H''}$ with the time, i.e. in the case of resonance. In the old
quantum theory the resonance or frequency condition was regarded as
the expression of the law of the conservation of energy on the
assumption that light of frequency $\nu$ can be absorbed or emitted in energy
quanta of the magnitude $h\nu$. We see that this relation is by no means
confined to light, being valid in the case of harmonic oscillations of any
kind. To the type of resonance implied there corresponds in our
pendulum model not ordinary resonance between the external force and
the free vibrations of a definite pendulum, but what in classical
mechanics is denoted by ‘parametric resonance’, which means the
coincidence of the frequency of the variation of the coupling $S^0_{H''H'}$
between two pendulums $H'$ and $H''$ with the difference of the
frequencies of their free vibrations (corresponding to the absence of the
coupling). It can, in fact, easily be shown that under this condition
even a very weak harmonic variation of the coupling coefficient ($\Phi_{H'H''}$)
must produce a steady transfer of energy from the $H'$-pendulum
(supposed to be initially the only one set in motion) to the $H''$-pendulum,
while all the other pendulums $H'''$ for which the condition of parametric
resonance is not fulfilled will perform oscillations of small amplitude
without any tendency towards a steady increase.

The quadratic increase of $N_{H''}$ with the time according to (176 a)
corresponds to a transition probability (referred to unit time)

$$\Gamma_{H''H'} = \frac{dN_{H''}}{dt} = \frac{2\pi^2}{h^2}\,|T^0_{H''H'}|^2\,t,$$

which is itself a linear function of the time.

This result is due to the exact coincidence between $\nu_{H''H'}$ and $\nu$ (sharp
resonance), which is practically never realized in nature. It has been
shown in Part I, § 18, that in the case of ‘nearly-monochromatic’ light,
formed by a spectral line of finite width, $N_{H''}$ becomes a linear function
of the time and the transition probability $\Gamma_{H''H'}$ becomes a constant.
The same is true, of course, of any nearly-harmonic perturbation.

We shall return to this question in the second part of this section 
where it will be dealt with by a different method. 

The preceding formula cannot be directly applied to the special case
$\nu = 0$ corresponding to a perturbing force which does not depend upon
the time. We must, namely, take into account the fact that in the
case $\nu > 0$ only one term of (175) is effective in producing transitions
from the state $H'$ to the state with higher energy $H'' = H'+h\nu$, while
the other would be effective in producing transitions from $H'$ to the
lower level $H'' = H'-h\nu$ (if such a level exists). Now when $\nu = 0$ both
terms of (175) become equally effective for the transition $H' \to H''$
(more simply, the splitting of $S$ into two terms becomes meaningless).
We thus get

$$\Delta_1 c_{H''H'} = -S^0_{H''H'}\,\frac{e^{i2\pi(H''-H')t/h}-1}{H''-H'} = -S^0_{H''H'}\,\frac{e^{i2\pi\nu_{H''H'}t}-1}{h\nu_{H''H'}}, \tag{177}$$

whence

$$\bar N_{H''} = \frac{2\,|S^0_{H''H'}|^2}{(H''-H')^2} \tag{177 a}$$

if $H'' \neq H'$, and

$$N_{H''} = \frac{4\pi^2}{h^2}\,|S^0_{H''H'}|^2\,t^2 \tag{177 b}$$

if $H'' = H'$, which is the resonance condition in the present case. This
type of ‘inner’ resonance is faithfully reproduced in our pendulum
model by the resonance between the pendulums representing the
unperturbed states $H'$ and $H''$. It will be noticed that the expression (177 b)
differs from the corresponding expression (176 a) for the case $\nu > 0$ by
a factor 4 in the numerator.
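
The formulae (177 a) and (177 b) can be compared with the exact behaviour of a system of two states only. The sketch below is an illustration under stated assumptions: units with $h = 1$, a perturbation with vanishing diagonal elements, and the standard closed-form two-state solution, which is not derived in the text. The names `s` (for $|S^0_{H''H'}|$) and `delta` (for $H''-H'$) are ours.

```python
import math


def n_first_order(t, s, delta, h=1.0):
    """|Delta_1 c|^2 from (177); reduces to (177 b) when delta = 0."""
    if delta == 0.0:
        return 4 * math.pi ** 2 * s ** 2 * t ** 2 / h ** 2
    return 4 * s ** 2 * math.sin(math.pi * delta * t / h) ** 2 / delta ** 2


def n_exact_two_state(t, s, delta, h=1.0):
    """Closed-form two-state population (standard result, assumed here)."""
    omega = math.sqrt(delta ** 2 + 4 * s ** 2)
    return 4 * s ** 2 / omega ** 2 * math.sin(math.pi * omega * t / h) ** 2


def time_average(f, period, samples=10000):
    """Mean of f over one period, by uniform sampling."""
    return sum(f(period * k / samples) for k in range(samples)) / samples


if __name__ == "__main__":
    s, delta = 0.01, 1.0
    # Off resonance: the average over one period h/delta agrees with (177 a).
    print(time_average(lambda t: n_first_order(t, s, delta), 1.0 / delta),
          2 * s ** 2 / delta ** 2)
    # Resonance: (177 b) agrees with the exact result only for small t.
    for t in (0.1, 5.0, 25.0):
        print(t, n_first_order(t, s, 0.0), n_exact_two_state(t, s, 0.0))
```

In the resonance case the exact population does not grow indefinitely but oscillates between 0 and 1, so that the quadratic law holds for the initial stage of the process only.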

The quantities $S^0_{H'H'}$, $S^0_{H''H''}$, etc., have the effect of slightly
disturbing the resonance between the corresponding pendulums, while the
quantities $S^0_{H'H''}$ describe the perturbing coupling forces. As long as
the latter are weak and there is no resonance, there corresponds to the
unperturbed vibration of each pendulum ($H'$) a perturbed normal
vibration of the whole system ($K'$) in which this particular pendulum plays
the principal role, while all the others only faintly accompany it. This
state of affairs is described by the formula $a_{H''K'} = \delta_{H''H'} + \Delta a_{H''K'}$ of
§ 19, where $a_{H''K'}$ are the transformation coefficients between the
functions $\chi^0_{K'}$ and $\psi^0_{H''}$; the small quantities $\Delta a_{H''K'}$ represent the
participation of the pendulums $H'' \neq H'$ in the normal vibration $K'$,
corresponding to the unperturbed oscillation of the pendulum $H'$ alone. We
might expect the quantities $N_{H''}$, or their average values, to be equal
to the square of the moduli of these small quantities. As a matter of
fact, we have, according to (148 b),

$$\Delta_1 a_{H''K'} = \frac{S^0_{H''H'}}{H'-H''},$$

and consequently

$$|\Delta_1 a_{H''K'}|^2 = \frac{|S^0_{H''H'}|^2}{(H''-H')^2},$$

which is equal to one-half of the value of $\bar N_{H''}$ as determined by
(177 a).

This discrepancy is explained by the fact that the quantities
$|\Delta_1 a_{H''K'}|^2$ refer to the stationary states ($\chi_{K'}$) of the perturbed system,
while the quantities (177 a) refer to the non-stationary states $\phi_{H'}$, or
more exactly to the initial stages in the development of these states,
as follows from the method of approximation used in deriving equation
(177). The limitation to the initial stages is practically irrelevant so
long as the quantities $c_{H''H'}$ remain small, i.e. so long as there is no
resonance ($H'' \neq H'$). It becomes, however, of primary importance in
the case of resonance, the formula (177 b) being valid for small values
of $t$ only.
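
The factor of two can be made explicit for two states. In the following sketch (an illustration only; the mixing-angle formula $\tan 2\theta = 2s/\delta$ is the standard exact result for a two-state system with coupling `s` and unperturbed energy difference `delta`, and is assumed rather than taken from the text) the weight of $H''$ in the perturbed stationary state is compared with the time average of the exact population of $H''$ when the system starts in $H'$.

```python
import math


def stationary_admixture(s, delta):
    """Weight of H'' in the perturbed stationary state K' (exact, two states)."""
    theta = 0.5 * math.atan2(2 * s, delta)
    return math.sin(theta) ** 2


def mean_population(s, delta):
    """Time average of the exact two-state population of H'', started in H'."""
    return 2 * s ** 2 / (delta ** 2 + 4 * s ** 2)


if __name__ == "__main__":
    s, delta = 0.01, 1.0
    stationary = stationary_admixture(s, delta)
    average = mean_population(s, delta)
    # For weak coupling the ratio tends to 2, as stated in the text.
    print(stationary, average, average / stationary)
```

For weak coupling the first quantity tends to $s^2/\delta^2$ and the second to $2s^2/\delta^2$, which is the relation between $|\Delta_1 a_{H''K'}|^2$ and (177 a) discussed above.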

The actual conditions met with in this case can be best understood 
with the help of the pendulum model. If initially only one pendulum, 
H’ say, were set in motion, then, however small the perturbing forces 
which couple it with other pendulums, those which are in resonance 
with it will gradually acquire large vibration amplitudes (while the rest 
will but faintly accompany them as before). Resonance thus excludes 
the ‘dominance’ of one particular pendulum in the perturbed vibra- 
tions: all the pendulums which are in resonance with each other become 
equally important in the vibrations started by any one of them. 


In the simplest case of two coupled pendulums in resonance we obtain
the following well-known results: If originally (when $t = 0$) only one of
the two pendulums was vibrating, then its vibration energy must 
gradually go over to the second pendulum. If both pendulums are 
identical, this process goes on until the first pendulum comes to a stand- 
still and the second takes over its role. Similar beats, i.e. relatively 
slow periodic increases and decreases of the vibrations of one pendulum 
at the cost of the other, must take place with any relations between 
their initial amplitudes and phases—except in two cases: ‘symmetrical’ 
vibrations with equal (real) amplitudes and phases, and ‘antisymmetrical’
with equal amplitudes and opposite phases. In these exceptional
cases the vibrations maintain a stationary character, i.e. their ampli- 
tudes remain constant. The symmetrical and antisymmetrical vibra- 
tions have somewhat different frequencies, both of which are, in general, 
different from the common unperturbed vibration frequency of the 
pendulums. 

The non-stationary vibrations can be represented by a superposition 
of the two kinds of stationary vibrations. The frequency of the resulting 
‘beats’ must obviously be equal to the difference of the two funda- 
mental frequencies. 

These results can easily be generalized to any finite number, $r'$ say,
of coupled pendulums in resonance. In the first approximation their
coupling with other pendulums can be neglected. The resulting
vibrations of the resonance group can be represented as a superposition of
$r'$ independent normal vibrations with different frequencies. By suitably
adjusting the amplitudes (and phases) of these normal vibrations, a
resulting vibration can be obtained such that, at the instant $t = 0$, one
pendulum only, $H'$ say, is in motion. The amplitudes of the others
will then at the beginning increase linearly with the time and their
energies increase proportionally to $t^2$, this dependence being restricted
to such values of $t$ as are small compared with the ‘beat periods’, that
is, the reciprocals of the frequency-differences between the different
normal modes of vibration.

These results can easily be obtained from the general theory embodied
in equations (161 a) and (161 b) of § 21. It should be remarked that,
although equations (161 a) must be used for the approximate
calculation of the numbers $N_{H''}$ (for the coefficients $c_{H''}$ can be supposed to
be approximately constant while the coefficients $C_{H''}$ cannot), equations
(161 b), with the coefficients $K^0_{H''H'}$, which are independent of the time,
are more appropriate for the discussion of the case of resonance, because
of their similarity to the equations which determine the vibrations of
a system of coupled pendulums, the only modification consisting in
replacing $-\dfrac{d^2}{dt^2}$ by $-\dfrac{h}{2\pi i}\dfrac{d}{dt}$.

If the coupling between the pendulums (i.e. $H$-states) not belonging
to the resonance (degenerate) set in question and those which belong to
this set is neglected, then the quantities $C_{H'}$ for the latter pendulums
can be determined by the system of $r'$ equations

$$-\frac{h}{2\pi i}\frac{dC_{H'_n}}{dt} = \sum_m K^0_{H'_nH'_m}\,C_{H'_m},$$

or in the notation corresponding to equations (154 b),

$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = \sum_{m=1}^{r'} K^0_{nm}\,C_m \qquad (n = 1, 2, \dots, r'). \tag{178}$$

With the help of the relations

$$C_n = c_n\,e^{-i2\pi H't/h} \quad\text{and}\quad K^0_{nm} = \delta_{nm}H' + S^0_{nm},$$

these equations can be reduced to the form

$$-\frac{h}{2\pi i}\frac{dc_n}{dt} = \sum_{m=1}^{r'} S^0_{nm}\,c_m. \tag{178 a}$$


The latter equations can be derived directly from the general equations
(161 a) in the same way as equations (178) have been derived from
(161 b), in conjunction with the condition $H'_n = H'_m = H'$ (i.e.
$S_{nm} = S^0_{nm}$), namely, by dropping terms connecting the states which
belong to the same energy $H'$ with those which belong to different
energy-levels. We have preferred, however, the indirect derivation in
order to preserve throughout the analogy with the classical theory of
the pendulum model. So far, however, as the results are concerned, the
$r'$ states of the same energy $H'$ can be represented equally well by two
systems of $r'$ pendulums whose oscillations are determined either by
equations (178) or (178 a).

Taking equations (178 a), we can first of all obtain the normal
vibrations (i.e. the $K$-stationary states) by putting $c_n = a_n\,e^{-i2\pi\,\Delta H'\,t/h}$
[or $C_n = a_n\,e^{-i2\pi K't/h}$ in the case of equations (178)], whereby it reduces
to the system of equations (154 b), which was obtained by another
method in § 19. After this, the general solution of (178 a) can be written
in the form

$$c_n = \sum_{s=1}^{r'} \gamma_s\,a_{ns}\,e^{-i2\pi(\Delta H')_s t/h}, \tag{178 b}$$

where the $(\Delta H')_s$ are the solutions of (154 c) and the $a_{ns}$ are the
corresponding normalized solutions of (154 b), while the $\gamma_s$ denote arbitrary
constants. As already mentioned, these constants can be adjusted in
such a way as to make all the $c_n$ vanish at the initial instant $t = 0$
with the exception of one of them, $c_m$ say. This particular set of $\gamma_s$
can conveniently be denoted by $\gamma_{sm}$.

We have, for their determination, the system of equations

$$\sum_s a_{ns}\,\gamma_{sm} = \delta_{nm}, \tag{179}$$

which shows that the matrix $\gamma$ is identical with $a^{-1}$ or $a^{\dagger}$. We thus
get, writing $c_{nm}$ instead of $c_n$,

$$c_{nm} = \sum_s a_{ns}\,a^{*}_{ms}\,e^{-i2\pi(\Delta H')_s t/h} \tag{179 a}$$

or

$$C_{nm} = \sum_s a_{ns}\,a^{*}_{ms}\,e^{-i2\pi K'_s t/h}. \tag{179 b}$$

Multiplying these expressions by their conjugate complex, we get

$$N_{nm} = \sum_s\sum_{s'} \rho^{nm}_{ss'}\cos\frac{2\pi(K'_s-K'_{s'})\,t}{h}, \tag{179 c}$$

where $\rho^{nm}_{ss'}$ is the real part of the product $a_{ns}\,a^{*}_{ms}\,a^{*}_{ns'}\,a_{ms'}$.


We thus see that $N_{nm}$ is represented as a function of the time as
a sum of constant terms ($s' = s$) and terms oscillating with the
‘difference-’ or ‘beat’-frequencies $\nu_{ss'} = (K'_s-K'_{s'})/h = (\Delta H'_s-\Delta H'_{s'})/h$.
So long as the product of the time $t$ with these frequencies (which are
the reciprocals of the ‘beat periods’) is small compared with 1, we can put

$$\cos 2\pi\nu_{ss'}t = 1 - \tfrac12\,(2\pi\nu_{ss'}t)^2,$$

which gives, since $N_{nm}$ vanishes for $t = 0$ (unless $m = n$),

$$N_{nm} = -4\pi^2 t^2 \sum_{s<s'} \rho^{nm}_{ss'}\,\nu^2_{ss'}.$$

This expression coincides with (177 b) if

$$\sum_{s<s'} \rho^{nm}_{ss'}\,\nu^2_{ss'} = -\frac{|S^0_{nm}|^2}{h^2}.$$

It can easily be shown, with the help of equations (154 b) and (154 c),
that this relation actually holds. We shall not, however, give the
proof of it here.
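
Although the proof is omitted, the relation is easily verified numerically for a resonance group of two states. The sketch below is an illustration under assumptions of our own: units with $h = 1$, a $2\times 2$ Hermitian matrix $S^0$ with a non-vanishing off-diagonal element, and function names that do not occur in the text. The eigenvalues play the part of the $(\Delta H')_s$ and the eigenvector components that of the $a_{ns}$.

```python
import math


def eigensystem_2x2(s11, s12, s22):
    """Eigenvalues and orthonormal eigenvectors of the Hermitian matrix
    [[s11, s12], [conj(s12), s22]], assuming s12 != 0.

    Returns (values, a) with a[n][s] the n-th component of eigenvector s.
    """
    mean = 0.5 * (s11 + s22)
    radius = math.sqrt((0.5 * (s11 - s22)) ** 2 + abs(s12) ** 2)
    values = [mean + radius, mean - radius]
    a = [[0j, 0j], [0j, 0j]]
    for s, lam in enumerate(values):
        v = [s12, lam - s11]  # satisfies (S - lam) v = 0
        norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
        a[0][s], a[1][s] = v[0] / norm, v[1] / norm
    return values, a


def rho_nu_squared(s11, s12, s22, h=1.0):
    """The sum over s < s' of rho * nu^2 for n = 1, m = 2 (a single term)."""
    values, a = eigensystem_2x2(s11, s12, s22)
    product = a[0][0] * a[1][0].conjugate() * a[0][1].conjugate() * a[1][1]
    nu = (values[0] - values[1]) / h
    return product.real * nu ** 2


if __name__ == "__main__":
    s12 = 0.3 + 0.4j
    # The two printed numbers should agree: -|S12|^2 / h^2 on the right.
    print(rho_nu_squared(0.2, s12, -0.5), -abs(s12) ** 2)
```

The diagonal elements are seen to have no influence on the result, in agreement with the fact that (177 b) contains the non-diagonal element $S^0_{nm}$ alone.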
It may be remarked that equation (179 a) reduces, subject to the same
condition or rather subject to the condition $(\Delta H')_s\,t/h \ll 1$ (for all $s$), to

$$c_{nm} = -\frac{2\pi i}{h}\Bigl(\sum_s a_{ns}\,a^{*}_{ms}\,(\Delta H')_s\Bigr)\,t,$$

while equation (177) gives, in the case of resonance,

$$\Delta_1 c_{H''H'} = c_{nm} = -\frac{2\pi i}{h}\,S^0_{nm}\,t,$$

from which, by the way, it follows that

$$S^0_{nm} = \sum_s a_{ns}\,a^{*}_{ms}\,(\Delta H')_s.$$

This relation can be derived from the equations

$$\sum_{n'} S^0_{nn'}\,a_{n's} = (\Delta H')_s\,a_{ns}$$

by multiplying them by $a^{*}_{ms}$ and summing over $s$. We thus get

$$\sum_s (\Delta H')_s\,a_{ns}\,a^{*}_{ms} = \sum_{n'} S^0_{nn'}\sum_s a_{n's}\,a^{*}_{ms} = \sum_{n'} S^0_{nn'}\,\delta_{n'm} = S^0_{nm}.$$

Further, it should be mentioned that an expansion of the same type as
that for the coefficients $c_{nm}$ is not possible for the coefficients $C_{nm}$,
as determined by (179 b), on account of the large value of the
frequencies $K'_s/h$. More exactly, the approximate expression $C_{nm} \sim t$
would be valid for exceedingly short times only (small compared with
the reciprocal of $K'/h$), which hardly come into consideration.

The resonance between the $r'$ states we have just considered
corresponds to an absolute degeneracy between these states in the sense of
the perturbation theory not involving the time. In the present theory
we need not, however, distinguish between this case and that of a
‘relative degeneracy’ (§ 20), so long as the energy-differences ($H'-H''$)
between the states under consideration are small compared with the
corresponding matrix elements of the perturbation energy $S^0_{H'H''}$. If
the ratios $S^0_{H''H'}/(H'-H'')$ are large compared with 1 we can still use
the expression (177 b) for the probability of the transition $H' \to H''$
provided the time $t$ is small compared with the reciprocal of the ‘beat
frequency’ $(H''-H')/h$. In the contrary case we must limit ourselves
to the expression (177 a) for the average value of the probability of
finding the system in the new state $H''$.

We have, hitherto, confined ourselves exclusively to the case of a 
discrete H-spectrum. The modifications of the general theory which
are necessary in order to allow for the presence of a continuous or mixed 
spectrum in a limited or unlimited range have already been indicated 
in the preceding section. They necessitate, however, an important 
revision of the approximate theory for the case of resonance between 
states belonging to a discrete set, on the one hand, and states belonging 
to a continuous set on the other (and also between states belonging to 
two different continuous sets). The essence of this revision consists in 
the replacement of the idea of sharp resonance, referring to two exactly 
determined states, by that of unsharp resonance for a narrow range or 
‘band’ of final states belonging to a continuous set. 



Let us consider transitions which are produced by a perturbing force
vibrating harmonically with the frequency $\nu$. The initial state will be
supposed to belong to a discrete set and to have the energy $H'$. If the
energy $H'+h\nu$ lies in the region of the continuous spectrum (as can
happen in the case of a hydrogen-like atom if $H' < 0$ while $H'+h\nu > 0$),
then transitions will be produced not only to the state with the energy
$H''_r = H'+h\nu$, but also to the neighbouring states whose energy $H''$ is
slightly different from $H''_r$. This follows from two considerations.
Firstly, the resonance condition $H'' = H'+h\nu$ need not be exactly
satisfied even when the final state belongs to a discrete set. Secondly,
the neighbouring states of a continuous set are themselves
approximately in resonance with each other and cannot therefore be considered
separately. We must consider instead a ‘band’ of neighbouring states
or, in other words, a ‘wave group’ formed by the superposition of the
harmonic waves representing them.

According to the general theory, we obtain for the coefficient $c_{H''}$ of
the functions $\psi_{H''}$ belonging to a continuous set exactly the same
differential equations as for the coefficients of the functions belonging
to a discrete set. If the particle were supposed to be initially in the
(discrete) state $H'$, then we have in both cases the same expression for
$c_{H''} = c_{H''H'}$, namely, (175). Limiting ourselves to states in the
neighbourhood of the resonance state with the energy $H'' = H''_r = H'+h\nu$,
we can drop the first term in (175) on account of its relative smallness,
so that

$$\Delta_1 c_{H''H'} = -\tfrac12\,T^0_{H''H'}\,e^{-i\beta}\,\frac{e^{i2\pi(H''-H'-h\nu)t/h}-1}{H''-H'-h\nu}. \tag{180}$$

If the functions $\psi_{H''}$ are duly normalized, the number of copies of
the particles that have passed during the time $t$ from the state $H'$ into
a range $\Delta H''$ about the resonance value $H''_r$ is given by the expression

$$N_{\Delta H''} = \int_{\Delta H''} |\Delta_1 c_{H''H'}|^2\,dH''. \tag{180 a}$$

Before carrying out the integration over $H''$ we must notice that this
integration actually refers to the energy alone if the other two
parameters specifying the wave functions $\psi_{H''}$ remain discrete (as, for
example, in the case of the hydrogen-like atom). If one or both of
these parameters are continuously variable, $dH''$ must be replaced by
the product of $dH''$ with the element or elements of these continuously
variable parameters. Leaving this case aside, we can calculate (180 a) by
integrating over the energy alone.


Since the last factor in (180) has, for not too small values of $t$, a very
sharp maximum at the resonance point $H'' = H''_r$ and comparatively
very small values outside the immediate vicinity of this point, we can
replace the first factor by its value for $H'' = H''_r$ and extend the
integration over the difference $H''-H''_r$ from $-\infty$ to $+\infty$.
Putting, for brevity, $2\pi(H''-H'-h\nu)t/h = \xi$, we then get

$$N_{\Delta H''} = \tfrac14\,|T^0_{H''_rH'}|^2\,\frac{2\pi t}{h}\int_{-\infty}^{+\infty}\frac{|e^{i\xi}-1|^2}{\xi^2}\,d\xi.$$

Since $|e^{i\xi}-1|^2 = 2(1-\cos\xi) = 4\sin^2\tfrac12\xi$ and

$$\int_{-\infty}^{+\infty}\frac{4\sin^2\tfrac12\xi}{\xi^2}\,d\xi = 2\int_{-\infty}^{+\infty}\Bigl(\frac{\sin\tfrac12\xi}{\tfrac12\xi}\Bigr)^2 d(\tfrac12\xi) = 2\pi,$$

this gives

$$N_{\Delta H''} = \frac{\pi^2}{h}\,|T^0_{H''_rH'}|^2\,t. \tag{181}$$

The probability of a transition from the state $H'$ into the band $\Delta H''_r$
per unit time is thus equal to

$$\Gamma_{H''_rH'} = \frac{\pi^2}{h}\,|T^0_{H''_rH'}|^2. \tag{181 a}$$
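
The linear growth expressed by (181) can be checked by carrying out the integration (180 a) numerically. The following sketch rests on assumptions of our own: units with $h = 1$, a matrix element `T0` taken as constant over the band, and a finite half-width `half_width` of the band in place of the infinite limits, so that a small truncation error remains. The function name is illustrative.

```python
import math


def band_population(t, T0, half_width=200.0, steps=200000, h=1.0):
    """Integrate |Delta_1 c|^2 of (180) over H'' - H''_r in [-W, W].

    Midpoint rule; by (180) the integrand is
    T0^2 * sin^2(pi * eps * t / h) / eps^2, with eps = H'' - H''_r.
    """
    d_eps = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        eps = -half_width + (k + 0.5) * d_eps  # midpoints never hit eps = 0
        total += (T0 * math.sin(math.pi * eps * t / h) / eps) ** 2
    return total * d_eps


if __name__ == "__main__":
    T0 = 0.01  # illustrative value of the matrix element
    for t in (1.0, 2.0, 4.0):
        # Second column: numerical integral; third: pi^2 |T0|^2 t / h from (181).
        print(t, band_population(t, T0), math.pi ** 2 * T0 ** 2 * t)
```

Doubling the time very nearly doubles the population of the band, in contrast with the quadratic law (176 a) which holds for a single sharply defined final state.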

The same result could be obtained with the help of the quasi-discrete
functions

$$\bar\psi_{H''} = \lim_{\Delta H''\to 0}\frac{1}{\sqrt{\Delta H''}}\int_{(\Delta H'')}\psi_{H''}\,dH''.$$

We must first consider the intervals $\Delta H''$ as finite and calculate the
coefficients $\bar c_{H''H'} = \Delta_1\bar c_{H''H'}$ according to formula (180) with the matrix
elements $T^0_{H''H'}$ replaced by

$$\bar T^0_{H''H'} = \int \bar\psi^{0*}_{H''}\,T\,\psi^0_{H'}\,dV \approx \frac{1}{\sqrt{\Delta H''}}\int dV\int_{(\Delta H'')}\psi^{0*}_{H''}\,T\,\psi^0_{H'}\,dH'' = \sqrt{\Delta H''}\int\psi^{0*}_{H''}\,T\,\psi^0_{H'}\,dV = \sqrt{\Delta H''}\;T^0_{H''H'}.$$

This formula is the more accurate the smaller the interval $\Delta H''$. We
can therefore use it in the calculation of the limiting value of the sum
$\sum |\Delta_1\bar c_{H''H'}|^2 = \sum |\Delta_1 c_{H''H'}|^2\,\Delta H''$ extended over a large number of
infinitely small intervals containing the resonance value $H''_r$. This limiting
value is obviously nothing else but the integral (180 a).
An important example of transitions of the mixed type just
considered is the ionization of an atom by the action of light, i.e. the
photoelectric effect. In this case we can put

$$S = -eE_0\,x\cos(2\pi\nu t+\beta),$$

where $E_0$ is the amplitude of the electric vector of the light waves,
supposed to be parallel to the $x$-axis, and $e$ is the charge of the electron.
This gives

$$\Gamma_{H''_rH'} = \frac{\pi^2e^2E_0^2}{h}\,|x^0_{H''_rH'}|^2. \tag{181 b}$$


Let us now turn to the case $\nu = 0$ corresponding to a perturbation
which does not depend upon the time. The transition being again from
a discrete state $H'$ to a continuous range of states $H''$ belonging to
approximately the same value of the energy, we can determine its
probability per unit time by the formula (181 a), putting $T = S$ and
introducing the factor 4, for the same reason as in the formula (177 b)
[in contradistinction from (176 a)]. We thus get

$$\Gamma_{H''H'} = \frac{4\pi^2}{h}\,|S^0_{H''H'}|^2 \qquad (H'' = H'). \tag{182}$$


Another (purely formal) modification which must be introduced for
the case $\nu = 0$ refers to the notation. If the continuous spectrum
overlaps the discrete spectrum (which is necessary for the resonance
condition $H'' = H'$ to be satisfied), we must introduce explicitly one or
two parameters in order to distinguish the different states (continuous
and discrete) which have the same energy. Denoting this parameter
by $Q$, we can rewrite (182) in the form

$$\Gamma_{H'Q'';\,H'Q'} = \frac{4\pi^2}{h}\,|S^0_{H'Q'';\,H'Q'}|^2. \tag{182 a}$$

If, finally, the parameter $Q''$ is continuously variable and if a range of
the continuous spectrum is specified by the product

$$\sigma(H'', Q'')\,dH''\,dQ'',$$

where $\sigma$ is a certain function of $H''$ and $Q''$ such that the probability of
finding the particle in the above range is equal to

$$|c_{H''Q''}|^2\,\sigma(H'', Q'')\,dH''\,dQ'',$$

then the probability of a resonance transition from the sharply defined
state $H'Q'$ into a band corresponding to the interval $dQ''$ is given by

$$\Gamma_{H'Q'';\,H'Q'} = \frac{4\pi^2}{h}\,|S^0_{H'Q'';\,H'Q'}|^2\,\sigma(H', Q'')\,dQ''. \tag{182 b}$$

The same modification applies to a resonance transition produced by
a harmonically vibrating perturbation. Instead of (181 a) we then get

$$\Gamma_{H''_rQ'';\,H'Q'} = \frac{\pi^2}{h}\,|T^0_{H''_rQ'';\,H'Q'}|^2\,\sigma(H''_r, Q'')\,dQ''. \tag{182 c}$$


It can easily be shown that these formulae remain valid when both 
the final and the initial states belong to a continuous set. We come upon 
this case in collision problems of the simplest type such as the deflexion 
of a particle by some field of force practically limited to a finite region 
of space, the initial and final states (‘before’ and ‘after’ the collision 
with the source of the perturbing field) being described by wave func- 
tions corresponding to the motion in the absence of this field. 

If, however, the final state belongs to a discrete set, then the initial
state must be specified unsharply, i.e. by a certain range of H’ (and 
eventually also of Q’). 

In conclusion the following circumstance must be pointed out. From 
the corpuscular point of view resonance means the conservation of energy. 
The fact that perturbing forces practically produce only those transi- 
tions which satisfy the resonance condition can be regarded from this 
point of view as the natural consequence of the law of conservation of 
energy. As we have seen, however, the resonance condition is not 
strictly obeyed in wave mechanics. First of all, transitions of a non- 
systematic character are produced from the initial state to states with 
an entirely different energy, the average probability of finding the
particle in these ‘stray’ states being given by the formula (183 a). 
Further, in the case of a continuous spectrum, the systematic transi- 
tions are governed by the condition of unsharp resonance, implying 
slight deviations from the law of conservation of energy. It thus seems 
that the latter does not strictly hold in wave mechanics. 

This conclusion is, however, wrong, for the simple reason that H does
not represent the actual energy of the particle, this energy, if the
perturbation S does not depend upon the time, being specified by the
characteristic values of the operator $K = H + S$. The resonance
equation $H'' = H'$ is therefore merely an approximate expression of the law
of conservation of energy which in reality should be expressed by
$K'' = K'$.

As a matter of fact, if the motion of the particle is described from
the point of view of K, i.e. by means of the characteristic functions of
this operator, then a set of stationary states is obtained between which
no transitions are possible, irrespective of whether $K'' = K'$ or $K'' \neq K'$.

It is only when the motion of the particle is described from the point
of view of H that transitions appear, produced by the neglected part
S of the total energy K. It is precisely this ‘misuse’ of the energy S
which is the cause of the apparent violation of the law of conservation
of energy. From the point of view of H, S is not a constant (unless it
commutes with H, which, in general, is not so) and therefore has no
definite value. It can therefore be regarded as the ‘goat’ responsible
for the deviations from the conservation law $H'' = H'$ in the transitions
for which this equation is not satisfied.
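The argument can be illustrated numerically. The following is only a sketch under assumed values: a hypothetical two-state system with unperturbed energies 0 and 1, a constant coupling 0.1, and units in which $h/2\pi = 1$ (none of these numbers come from the text). It integrates the equations of motion with a standard Runge-Kutta scheme and compares the mean value of H with that of K = H + S.

```python
def apply(op, vec):
    """Multiply a 2x2 matrix (list of rows) by a 2-component vector."""
    return [sum(op[a][b] * vec[b] for b in range(2)) for a in range(2)]


def expectation(op, vec):
    """Real expectation value <vec|op|vec> for a Hermitian 2x2 matrix."""
    image = apply(op, vec)
    return sum(vec[a].conjugate() * image[a] for a in range(2)).real


def evolve(vec, total, dt, steps):
    """Integrate i dc/dt = K c with the classical fourth-order Runge-Kutta rule."""
    def rhs(v):
        return [-1j * x for x in apply(total, v)]

    for _ in range(steps):
        k1 = rhs(vec)
        k2 = rhs([vec[a] + 0.5 * dt * k1[a] for a in range(2)])
        k3 = rhs([vec[a] + 0.5 * dt * k2[a] for a in range(2)])
        k4 = rhs([vec[a] + dt * k3[a] for a in range(2)])
        vec = [vec[a] + dt * (k1[a] + 2 * k2[a] + 2 * k3[a] + k4[a]) / 6
               for a in range(2)]
    return vec


# Assumed illustrative values (not from the text).
H = [[0.0, 0.0], [0.0, 1.0]]   # unperturbed energies H' = 0, H'' = 1
S = [[0.0, 0.1], [0.1, 0.0]]   # constant perturbation coupling the two states
K = [[H[a][b] + S[a][b] for b in range(2)] for a in range(2)]

start = [1.0 + 0j, 0.0 + 0j]   # particle initially in the state H'
end = evolve(start, K, dt=0.001, steps=3081)

drift_in_H = expectation(H, end) - expectation(H, start)
drift_in_K = expectation(K, end) - expectation(K, start)
print(drift_in_H, drift_in_K)
```

In this sketch the mean value of H changes by a few hundredths, while that of K stays constant to within the integration error, in line with the statement that the conservation law refers to K and not to H.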

A similar consideration applies even more strongly to the general 
case in which S does depend upon the time, for in this case the values 
of the total energy K remain undetermined. 


23. Second Approximation; Theory of Combined Transitions 
The preceding considerations pave the way to an understanding of 
transitions the probability of which vanishes when derived from the 
equations of the first approximation but does not vanish when estimated 
with the help of the second approximation. 

According to equations (173) and (173a), we have this case if the
matrix component $S_{H''H'}$ vanishes, while there is one or several states
$H'''$ such that the components $S_{H''H'''}$ and $S_{H'''H'}$ are both different from
zero.

For the sake of simplicity we shall first consider the case of discrete
states together with a perturbation independent of the time. If there
is no resonance between the initial and final states, i.e. if $H'' \neq H'$, then
the probability amplitude $c_{H''H'} = \Delta_2 c_{H''H'}$ of finding the particle in
the state $H''$ will remain a small quantity of the second order, and the
square of its modulus $N_{H''}$ will oscillate about an average value of
the fourth order of smallness. If, however, $H'' = H'$, $c_{H''H'}$ will increase
linearly and $N_{H''}$ will increase quadratically with the time, which means
that there are systematic transitions from the initial state $H'$ to the
final state $H''$ via one or several intermediate states $H'''$. For these
intermediate states the resonance condition with the end states need not
(and in general cannot) be satisfied; the fact, however, that in the
combined transitions $H' \to H''' \to H''$ the particle has to pass through
a state with an energy $H'''$ different from the initial (and final) value
does not in the least prevent it from making such transitions. The
apparent violation of the energy law for each of the two ‘legs’ of the
jump from $H'$ to $H''$ can obviously be straightened out by taking into
account the perturbation energy S not only as the cause of the
transition but also as an invisible factor in the energy balance. If, for instance,
$H''' > H'$, then we can imagine that the energy $H''' - H'$, which is
required for the first step of the transition, is ‘borrowed’ from the
perturbation energy S and restored to it during the second step.

The probability amplitude $c_{H'''H'} = \Delta_1 c_{H'''H'}$ of the state $H'''$ will
remain small, the corresponding probability (or number of copies in
the state $H'''$) $N_{H'''} = |\Delta_1 c_{H'''H'}|^2$ oscillating about the constant value
$2|S^0_{H'''H'}|^2/(H'''-H')^2$, while the number $N_{H''}$, though initially much
smaller, increases with the time, and may finally become very large.
We can visualize this process by imagining each state as a vessel which
may be filled with a liquid representing the probability or the number
of copies. This liquid is initially concentrated in the vessel $H'$ and is
pumped by the perturbation to the vessel $H''$ with which it is connected
indirectly through a set of vessels $H'''$; the liquid does not, however,
accumulate in the latter, merely passing through them and accumulating
in $H''$. A still better picture of this transition process is provided by
our pendulum model, the probability or number of copies being
represented by the energy flowing from the pendulum $H'$ to the pendulum
$H''$ which is coupled with it through the pendulums $H'''$. The lack of
resonance between the latter and $H'$ results in these pendulums
performing steady oscillations of small amplitude and functioning simply
as carriers of energy from $H'$ to $H''$.
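The constant value about which the population of an intermediate state oscillates can be checked with a few lines of code. The sketch below assumes the first-order amplitude for a time-independent perturbation, $\Delta_1 c = -S^0\,(e^{i2\pi(H'''-H')t/h}-1)/(H'''-H')$, and uses arbitrary illustrative numbers (matrix element 0.01, energy difference 1, h = 1); the function names are ours.

```python
import cmath
import math


def first_order_probability(s0, delta_h, t, h=1.0):
    """|Delta_1 c|^2 for a constant perturbation with matrix element s0.

    delta_h stands for the energy difference H''' - H'; h is Planck's
    constant, set to 1 here (an arbitrary choice of units).
    """
    phase = cmath.exp(2j * math.pi * delta_h * t / h)
    amplitude = -s0 * (phase - 1.0) / delta_h
    return abs(amplitude) ** 2


def time_average(s0, delta_h, periods=5, steps=10000, h=1.0):
    """Average the probability over a whole number of oscillation periods."""
    total_time = periods * h / abs(delta_h)
    dt = total_time / steps
    samples = (first_order_probability(s0, delta_h, (k + 0.5) * dt, h)
               for k in range(steps))
    return sum(samples) / steps


s0, delta_h = 0.01, 1.0                      # assumed illustrative values
average = time_average(s0, delta_h)
expected = 2 * abs(s0) ** 2 / delta_h ** 2   # 2|S0|^2 / (H''' - H')^2
print(average, expected)
```

The two printed numbers agree, since the oscillating cosine term averages to zero over complete periods.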

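After these preliminary considerations of a qualitative character, we
can pass to the quantitative theory of the double transitions. Putting
in (173a)

$$S_{H'''H'} = S^0_{H'''H'}\,e^{i2\pi(H'''-H')t/h}$$

and

$$S_{H''H'''} = S^0_{H''H'''}\,e^{i2\pi(H''-H''')t/h},$$

we get

$$\Delta_2 c_{H''H'} = \sum_{H'''} S^0_{H''H'''}\,S^0_{H'''H'}\,f_{H''H'''H'}(t), \tag{183}$$

where

$$f_{H''H'''H'}(t) = -\frac{4\pi^2}{h^2}\int_0^t dt'\,e^{i2\pi(H''-H''')t'/h}\int_0^{t'} dt''\,e^{i2\pi(H'''-H')t''/h}
= \frac{2\pi i}{h}\,\frac{1}{H'''-H'}\int_0^t \bigl[e^{i2\pi(H''-H')t'/h} - e^{i2\pi(H''-H''')t'/h}\bigr]\,dt',$$

that is,

$$f_{H''H'''H'}(t) = \frac{e^{i2\pi(H''-H')t/h}-1}{(H'''-H')(H''-H')} - \frac{e^{i2\pi(H''-H''')t/h}-1}{(H''-H''')(H'''-H')}. \tag{183a}$$

In the case of resonance $H'' = H'$ this expression reduces to

$$f_{H'H'''H'}(t) = \frac{i2\pi t}{h\,(H'''-H')} + \frac{e^{i2\pi(H'-H''')t/h}-1}{(H'''-H')^2}.$$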


Dropping the second term on account of its smallness, we thus get

$$\Delta_2 c_{H''H'} = \frac{i2\pi t}{h}\sum_{H'''}\frac{S^0_{H''H'''}\,S^0_{H'''H'}}{H'''-H'}. \tag{183b}$$

We did not replace $S^0_{H''H'''}$ by $S^0_{H'H'''}$, in spite of the fact that $H'' = H'$,
in order to indicate somehow that the final state is different from the
initial one. This can be done in a clearer way by introducing the
additional suffix Q and writing $S^0_{H'Q'',\,H'''Q'''}\,S^0_{H'''Q''',\,H'Q'}$ instead of
$S^0_{H''H'''}\,S^0_{H'''H'}$.
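As a check on the closed form (183a), the double time integral defining $f$ can be evaluated numerically and compared with it. The sketch below is ours, not the author's: it uses a cumulative trapezoidal rule, units with h = 1, and arbitrary assumed energies $H' = 0$, $H''' = 1.3$, $H'' = 0.4$; the function names are hypothetical.

```python
import cmath
import math


def f_closed(h_final, h_mid, h_init, t, h=1.0):
    """Closed form (183a); h_final, h_mid, h_init stand for H'', H''', H'."""
    w = 2 * math.pi / h
    first = (cmath.exp(1j * w * (h_final - h_init) * t) - 1) / (
        (h_mid - h_init) * (h_final - h_init))
    second = (cmath.exp(1j * w * (h_final - h_mid) * t) - 1) / (
        (h_final - h_mid) * (h_mid - h_init))
    return first - second


def f_numeric(h_final, h_mid, h_init, t, steps=50000, h=1.0):
    """Nested time integral for f, by a cumulative trapezoidal rule."""
    w = 2 * math.pi / h
    dt = t / steps
    inner = 0j                    # running value of the inner integral
    outer = 0j                    # running value of the outer integral
    prev_inner = 1 + 0j           # inner integrand at t'' = 0
    prev_outer = 0j               # outer integrand at t' = 0 (inner integral is 0)
    for k in range(1, steps + 1):
        tk = k * dt
        cur_inner = cmath.exp(1j * w * (h_mid - h_init) * tk)
        inner += 0.5 * dt * (prev_inner + cur_inner)
        cur_outer = cmath.exp(1j * w * (h_final - h_mid) * tk) * inner
        outer += 0.5 * dt * (prev_outer + cur_outer)
        prev_inner, prev_outer = cur_inner, cur_outer
    return -(w ** 2) * outer


def f_resonant(h_mid, h_init, t, h=1.0):
    """Limit of (183a) for H'' = H'."""
    w = 2 * math.pi / h
    gap = h_mid - h_init
    return 1j * w * t / gap + (cmath.exp(-1j * w * gap * t) - 1) / gap ** 2


# Assumed illustrative energies (not from the text).
closed = f_closed(0.4, 1.3, 0.0, 1.5)
numeric = f_numeric(0.4, 1.3, 0.0, 1.5)
near_resonance = f_closed(1e-6, 1.3, 0.0, 1.5)
at_resonance = f_resonant(1.3, 0.0, 1.5)
print(abs(closed - numeric), abs(near_resonance - at_resonance))
```

Both differences come out small, which supports the signs adopted in (183a) and in its resonant limit.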
In the case of double transitions, just as in the case of simple
transitions, one usually has to do with an unsharp resonance between the
initial state and a band of continuously variable final states. If the
energy is the only continuously variable parameter, the probability of
transition from $H'Q'$ to $H''Q''$ in the time t is expressed by the integral

$$N_{Q''} = \int |\Delta_2 c_{H''H'}(t)|^2\,dH''$$

extended over the neighbourhood of the resonance value $H'' = H'$. In
carrying out the integration we can drop the second term in the
expression (183a). With this condition we must obviously get the same result
as for the simple resonance transition $H' \to H''$, with the matrix element
$S^0_{H''H'}$ replaced by the expression

$$\sum_{H'''}\frac{S^0_{H''H'''}\,S^0_{H'''H'}}{H'-H'''}.$$

We thus obtain for the probability per unit time of the transition
$H' \to H''$ the following formula [cf. eq. (182a)]:

$$\Gamma_{H''H'} = \frac{4\pi^2}{h}\,\Bigl|\sum_{H'''}\frac{S^0_{H''H'''}\,S^0_{H'''H'}}{H'-H'''}\Bigr|^2 \qquad (H'' = H'). \tag{184}$$
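The role of the effective matrix element $\sum S^0_{H''H'''}S^0_{H'''H'}/(H'-H''')$ can be seen in a small numerical experiment. The sketch below is an illustration under assumed values, not part of the original argument: three discrete states with $H' = H'' = 0$, one intermediate state at $H''' = 1$, equal real couplings 0.02 between the end states and the intermediate one, no direct coupling, and units with $h = 2\pi$. It compares the exactly integrated population of $H''$ with $|\Delta_2 c_{H''H'}|^2$ from (183b).

```python
def apply(op, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    n = len(vec)
    return [sum(op[a][b] * vec[b] for b in range(n)) for a in range(n)]


def evolve(vec, total, dt, steps):
    """Integrate i dc/dt = K c (units with h = 2*pi) by fourth-order Runge-Kutta."""
    n = len(vec)

    def rhs(v):
        return [-1j * x for x in apply(total, v)]

    for _ in range(steps):
        k1 = rhs(vec)
        k2 = rhs([vec[a] + 0.5 * dt * k1[a] for a in range(n)])
        k3 = rhs([vec[a] + 0.5 * dt * k2[a] for a in range(n)])
        k4 = rhs([vec[a] + dt * k3[a] for a in range(n)])
        vec = [vec[a] + dt * (k1[a] + 2 * k2[a] + 2 * k3[a] + k4[a]) / 6
               for a in range(n)]
    return vec


# Assumed illustrative values; basis order: H' (initial), H'' (final), H''' (intermediate).
s, gap = 0.02, 1.0
K = [[0.0, 0.0, s],
     [0.0, 0.0, s],
     [s, s, gap]]

t = 100.0
end = evolve([1.0 + 0j, 0j, 0j], K, dt=0.01, steps=10000)
exact_population = abs(end[1]) ** 2

effective_element = s * s / (0.0 - gap)                 # S0 S0 / (H' - H''')
second_order_population = (effective_element * t) ** 2  # |(i 2 pi t / h) ...|^2 with h = 2 pi
print(exact_population, second_order_population)
```

For times short enough that the population of $H''$ is still small, the two numbers agree to within a few per cent; the residual difference comes from the dropped oscillating term and from higher approximations.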


This formula is not complete in two respects. Firstly, it does not take
into account other parameters (Q) in addition to the energy. Secondly,
it neglects intermediate states belonging to the continuous energy
spectrum. If the parameter Q is discretely variable, we get, instead of
(184), the expression

$$\Gamma_{H''Q'',\,H'Q'} = \frac{4\pi^2}{h}\,\Bigl|\sum_{H'''}\sum_{Q'''}\frac{S^0_{H''Q'',\,H'''Q'''}\,S^0_{H'''Q''',\,H'Q'}}{H'-H'''}\Bigr|^2 \qquad (H'' = H'). \tag{184a}$$


If Q is itself continuously variable, then the summation over $Q'''$ must
be replaced by an integration, the element $dQ'''$ being multiplied by the
factor $\sigma(H''', Q''')$, and $Q''$ being replaced by the element $dQ''$ with the
factor $\sigma(H', Q'')$ on the right side of (184a).

If there is a slight direct coupling between the states $H'$ and $H''$,
then the transition probability is determined by the sum of $\Delta_1 c_{H''H'}$
and $\Delta_2 c_{H''H'}$, so that instead of (184) we get

$$\Gamma_{H''H'} = \frac{4\pi^2}{h}\,\Bigl|S^0_{H''H'} + \sum_{H'''}\frac{S^0_{H''H'''}\,S^0_{H'''H'}}{H'-H'''}\Bigr|^2. \tag{184b}$$

It often happens that the perturbation is due to the simultaneous
action of two different forces which are incoherent with regard to each
other, in the sense that they involve independent phase-factors, over
which one must average, with the result that all quantities containing
odd powers of these factors vanish.

We thus get

$$S = F + G, \tag{185}$$

and

$$|S^0_{H''H'}|^2 = |F^0_{H''H'}|^2 + |G^0_{H''H'}|^2, \tag{185a}$$

the average value of the product of $F^0_{H''H'}$ with $G^{0*}_{H''H'}$ being equal to zero.

If we consider simple transitions $H' \to H''$ produced by the
simultaneous action of two such perturbations, we get for the transition
probability the sum of the two probabilities, corresponding to the action
of each of the two perturbations taken separately.
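The vanishing of the cross term on averaging over independent phases can be verified directly. In the sketch below the two complex numbers f and g are arbitrary stand-ins for the matrix elements of F and G (assumed values, not from the text), each multiplied by its own phase factor; the average is taken over a uniform grid of both phases.

```python
import cmath
import math


def phase_averaged_intensity(f, g, n=120):
    """Average |f e^{ia} + g e^{ib}|^2 over a uniform grid of independent phases a, b."""
    total = 0.0
    for i in range(n):
        a = 2 * math.pi * i / n
        for j in range(n):
            b = 2 * math.pi * j / n
            total += abs(f * cmath.exp(1j * a) + g * cmath.exp(1j * b)) ** 2
    return total / (n * n)


f = 0.3 + 0.4j     # stand-in for a matrix element of F (assumed value)
g = 1.2 - 0.5j     # stand-in for a matrix element of G (assumed value)
averaged = phase_averaged_intensity(f, g)
sum_of_squares = abs(f) ** 2 + abs(g) ** 2
print(averaged, sum_of_squares)
```

The averaged square of the modulus equals the sum of the separate squares, which is the content of (185a).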

However, in the case of combined transitions, we get, according to
(184), the following expression for the transition probability

$$\Gamma_{H''H'} = (F,F)_{H''H'} + (F,G)_{H''H'} + (G,G)_{H''H'}, \tag{186}$$

the first and last terms being obtained from (184) by replacing S by
F or G. They represent the ‘solo’ action of the two perturbing forces,
while the middle term represents their combined action, one of the
perturbing forces producing the first and the other the second step of
the transition. This combination term

$$(F,G)_{H''H'} = \frac{4\pi^2}{h}\,\Bigl|\sum_{H'''}\frac{F^0_{H''H'''}\,G^0_{H'''H'} + G^0_{H''H'''}\,F^0_{H'''H'}}{H'-H'''}\Bigr|^2 \qquad (H'' = H') \tag{186a}$$


turns out to be, in many cases, more important than the two ‘pure’
terms.

These considerations acquire a particular importance in the generaliza- 
tion of the preceding results for the case of a perturbation depending 
upon the time. 

Let us first assume that S reduces to a simple harmonic vibration
without a constant term. We then have, as before,

$$S = T(x,y,z)\cos(2\pi\nu t+\beta).$$

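Substituting this in equation (173a), we get the former expression
(183) for $\Delta_2 c_{H''H'}$ (with $T^0$ in place of $S^0$) and with

$$f_{H''H'''H'}(t) = -\frac{\pi^2}{h^2}\int_0^t dt'\,\Bigl\{e^{i[2\pi(\nu_{H''H'''}+\nu)t'+\beta]} + e^{i[2\pi(\nu_{H''H'''}-\nu)t'-\beta]}\Bigr\}\int_0^{t'} dt''\,\Bigl\{e^{i[2\pi(\nu_{H'''H'}+\nu)t''+\beta]} + e^{i[2\pi(\nu_{H'''H'}-\nu)t''-\beta]}\Bigr\},$$

i.e.

$$\begin{aligned}
f_{H''H'''H'}(t) = \frac{1}{4h^2}\Biggl\{\;
& e^{2i\beta}\Bigl[\frac{e^{i2\pi(\nu_{H''H'}+2\nu)t}-1}{(\nu_{H''H'}+2\nu)(\nu_{H'''H'}+\nu)} - \frac{e^{i2\pi(\nu_{H''H'''}+\nu)t}-1}{(\nu_{H''H'''}+\nu)(\nu_{H'''H'}+\nu)}\Bigr] \\
& + \Bigl[\frac{e^{i2\pi\nu_{H''H'}t}-1}{\nu_{H''H'}(\nu_{H'''H'}-\nu)} - \frac{e^{i2\pi(\nu_{H''H'''}+\nu)t}-1}{(\nu_{H''H'''}+\nu)(\nu_{H'''H'}-\nu)}\Bigr] \\
& + \Bigl[\frac{e^{i2\pi\nu_{H''H'}t}-1}{\nu_{H''H'}(\nu_{H'''H'}+\nu)} - \frac{e^{i2\pi(\nu_{H''H'''}-\nu)t}-1}{(\nu_{H''H'''}-\nu)(\nu_{H'''H'}+\nu)}\Bigr] \\
& + e^{-2i\beta}\Bigl[\frac{e^{i2\pi(\nu_{H''H'}-2\nu)t}-1}{(\nu_{H''H'}-2\nu)(\nu_{H'''H'}-\nu)} - \frac{e^{i2\pi(\nu_{H''H'''}-\nu)t}-1}{(\nu_{H''H'''}-\nu)(\nu_{H'''H'}-\nu)}\Bigr]\Biggr\}.
\end{aligned}$$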


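This expression clearly shows that the resonance condition $\nu_{H''H'} = \pm\nu$
(i.e. $H''-H' = \pm h\nu$) of the theory of simple transitions has to be
replaced in the case of double transitions by the condition

$$\nu_{H''H'} = \pm 2\nu \ \text{or}\ 0,$$

that is, $H''-H' = \pm 2h\nu$ or 0,
giving respectively

$$f_{H''H'''H'} = \frac{i\pi t}{2h}\,\frac{e^{\mp 2i\beta}}{H'''-H'\mp h\nu} \qquad\text{or}\qquad f_{H''H'''H'} = \frac{i\pi t}{2h}\Bigl(\frac{1}{H'''-H'+h\nu} + \frac{1}{H'''-H'-h\nu}\Bigr).$$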


These results can easily be interpreted by assuming that each step of
the double transition $H' \to H''' \to H''$ consists either in the absorption or
in the (forced) emission of one quantum $h\nu$ of light, if, for the sake of
concreteness, the perturbation S is regarded as due to monochromatic
light of frequency $\nu$.

This interpretation is supported by the fact that the transition
probability as determined by the square of $\Delta_2 c_{H''H'}$ turns out to be
proportional to the square of the intensity of the light (i.e. to the fourth
power of the electric force $E_0$, to which S must be proportional in the
case under consideration). This is just what would be expected if the
probability of each of the two steps of the transition is proportional
to the intensity of the light.

It must be emphasized, however, that for each of these two steps
the usual resonance condition ($\nu_{H'''H'} = \pm\nu$ for the first step,
$\nu_{H''H'''} = \pm\nu$ for the second) is, in general, not satisfied.
We have here the same situation as in the case $\nu = 0$ discussed above:
an apparent violation of the energy principle, straightened out by the
perturbation energy whose value is actually indeterminate.

It is, in principle, quite possible for light to induce transitions whose 
probability is proportional not to the first but to the second or even 
to a higher power of its intensity. In order that such effects could be 
observed, however, the intensity of the light must be extremely high, 
in fact much higher than that with which we usually have to do in our 
laboratory experiments. For, according to these experiments, the 
transition probability, as measured by the rate of photo-ionization for 
example, turns out to be exactly proportional to the light intensity. 

We are thus entitled to conclude that double transitions produced 
by the action of light alone practically do not occur—on the surface of 
the earth at least. 

There is, however, a great variety of phenomena which can be 
described as double transitions under the combined action of light and 
some other perturbation which does not depend upon the time. 

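Such combined perturbations are represented by a function of the
type

$$S = T(x,y,z)\cos(2\pi\nu t+\beta) + G(x,y,z). \tag{187}$$

If, in the calculation of $\Delta_2 c_{H''H'}$, only those terms are preserved which
are bilinear in T and G, i.e. proportional to their product, then, instead
of (183), we get

$$\Delta_2 c_{H''H'} = \sum_{H'''}\bigl[T^0_{H''H'''}\,G^0_{H'''H'}\,f_{H''H'''H'}(t) + G^0_{H''H'''}\,T^0_{H'''H'}\,g_{H''H'''H'}(t)\bigr], \tag{187a}$$

with

$$f_{H''H'''H'}(t) = -\frac{2\pi^2}{h^2}\int_0^t dt'\,\Bigl\{e^{i[2\pi(\nu_{H''H'''}+\nu)t'+\beta]} + e^{i[2\pi(\nu_{H''H'''}-\nu)t'-\beta]}\Bigr\}\int_0^{t'} e^{i2\pi\nu_{H'''H'}t''}\,dt''$$

and

$$g_{H''H'''H'}(t) = -\frac{2\pi^2}{h^2}\int_0^t dt'\,e^{i2\pi\nu_{H''H'''}t'}\int_0^{t'} dt''\,\Bigl\{e^{i[2\pi(\nu_{H'''H'}+\nu)t''+\beta]} + e^{i[2\pi(\nu_{H'''H'}-\nu)t''-\beta]}\Bigr\},$$

that is

$$\begin{aligned}
f_{H''H'''H'}(t) = \frac{1}{2h^2}\Biggl\{\;
& e^{i\beta}\Bigl[\frac{e^{i2\pi(\nu_{H''H'}+\nu)t}-1}{(\nu_{H''H'}+\nu)\,\nu_{H'''H'}} - \frac{e^{i2\pi(\nu_{H''H'''}+\nu)t}-1}{(\nu_{H''H'''}+\nu)\,\nu_{H'''H'}}\Bigr] \\
& + e^{-i\beta}\Bigl[\frac{e^{i2\pi(\nu_{H''H'}-\nu)t}-1}{(\nu_{H''H'}-\nu)\,\nu_{H'''H'}} - \frac{e^{i2\pi(\nu_{H''H'''}-\nu)t}-1}{(\nu_{H''H'''}-\nu)\,\nu_{H'''H'}}\Bigr]\Biggr\}
\end{aligned}$$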


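and

$$\begin{aligned}
g_{H''H'''H'}(t) = \frac{1}{2h^2}\Biggl\{\;
& e^{i\beta}\Bigl[\frac{e^{i2\pi(\nu_{H''H'}+\nu)t}-1}{(\nu_{H''H'}+\nu)(\nu_{H'''H'}+\nu)} - \frac{e^{i2\pi\nu_{H''H'''}t}-1}{\nu_{H''H'''}(\nu_{H'''H'}+\nu)}\Bigr] \\
& + e^{-i\beta}\Bigl[\frac{e^{i2\pi(\nu_{H''H'}-\nu)t}-1}{(\nu_{H''H'}-\nu)(\nu_{H'''H'}-\nu)} - \frac{e^{i2\pi\nu_{H''H'''}t}-1}{\nu_{H''H'''}(\nu_{H'''H'}-\nu)}\Bigr]\Biggr\}.
\end{aligned}$$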


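The two expressions define the resonance condition in the same way
as for a simple transition produced by the action of the light alone.
In the case of an unsharp resonance in the neighbourhood of the
value $H'' = H' \pm h\nu$, these expressions practically reduce to

$$f_{H''H'''H'}(t) = \frac{1}{2h^2}\,e^{\mp i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}\mp\nu)t}-1}{(\nu_{H''H'}\mp\nu)\,\nu_{H'''H'}}, \qquad
g_{H''H'''H'}(t) = \frac{1}{2h^2}\,e^{\mp i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}\mp\nu)t}-1}{(\nu_{H''H'}\mp\nu)(\nu_{H'''H'}\mp\nu)}, \tag{187b}$$

so that we get

$$\Delta_2 c_{H''H'} = \frac{1}{2h^2}\,e^{\mp i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}\mp\nu)t}-1}{\nu_{H''H'}\mp\nu}\sum_{H'''}\Bigl(\frac{T^0_{H''H'''}\,G^0_{H'''H'}}{\nu_{H'''H'}} + \frac{G^0_{H''H'''}\,T^0_{H'''H'}}{\nu_{H'''H'}\mp\nu}\Bigr)$$

and consequently

$$\Gamma_{H''H'} = \frac{\pi^2}{h}\,\Bigl|\sum_{H'''}\Bigl(\frac{T^0_{H''H'''}\,G^0_{H'''H'}}{H'''-H'} + \frac{G^0_{H''H'''}\,T^0_{H'''H'}}{H'''-H'\mp h\nu}\Bigr)\Bigr|^2 \qquad (H''-H' = \pm h\nu) \tag{187c}$$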


instead of (184). This formula should be completed to allow for transi- 
tions through states belonging to the continuous H-spectrum, and also 
for other parameters (Q) besides the energy, in the same way as (184). 

It must be mentioned that those terms (quadratic in T or G)
which have been dropped in formula (187a) have no importance so
long as we restrict ourselves to resonance transitions of the above type.
As shown above, they would become predominant only for transitions
of the type $H'' = H' \pm 2h\nu$ or $H'' = H'$.

An interesting feature of the expression (187c) is the non-symmetrical
character of the two terms in the brackets with regard to the frequency
$\nu$. The latter affects the second term only, which corresponds to the
action of light in the first step of the transition, while in the first term,
which corresponds to the action of light in the second step, the
frequency $\nu$ appears only through the subscript $H''$.

As an example of the application of the formula (187c) we could cite
the problem of the transformation of light into heat in gaseous bodies.
In this case G must represent the perturbing force experienced by the
atom under consideration due to other atoms with which it is
supposed to come into collision. The complete treatment of this problem
requires, however, the generalization of the preceding theory to allow
for the motion of all the particles which act on each other (see Part III).

Another example of double transitions of the above kind is provided
by the phenomenon of the scattering of light which can be considered
as a combination of two elementary acts (simple transitions), namely,
the absorption of a light quantum $h\nu$ and the spontaneous emission of
another light quantum $h\nu'$ corresponding, in general, to a different
frequency. The two acts may take place in either order, since the law
of the conservation of energy need not be satisfied in the intermediate
state (if the perturbation energy is left out of account).

The application of formula (187c) to the case of the scattering of
light necessitates, however, two important amendments both in the
underlying principles and in the form of the result.

First of all it is necessary to visualize a ‘spontaneous’ transition,
associated with light emission, as caused by some perturbation G, the
reaction of the electron’s radiation field on itself, for example (see
Part I, § 18). This question has, however, no practical significance,
since in formula (187c) we have to do not with the perturbation energy
G itself (which cannot be specified in the usual way, i.e. as a function
of the coordinates or as an operator G(v)) but with its matrix elements
only. The latter, however, can be regarded as known, since they
define the emission probability for which the expression (93), § 17,
Part I, can be used. Identifying this expression with the expression
$4\pi^2\,|G^0_{H''H'}|^2\,\sigma(H'')/h$, we can determine the matrix elements of G
provided the function $\sigma(H'')$ is known.

We shall not investigate this question here, for it will be considered
in detail later in connexion with a more direct theory of light-scattering.
It must be mentioned, however, that this theory leads to a formula for
$\Gamma_{H''H'}$ which differs from (187c) in two respects.

Firstly, the resonance condition $H'' = H' + h\nu$ is replaced by

$$H'' = H' + h\nu - h\nu', \tag{188}$$


where v is the frequency of the absorbed and v’ the frequency of the 
emitted (‘scattered’) light. This result can be considered as the direct 
consequence of the energy principle. 
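The condition (188) fixes the frequency of the scattered light once the incident frequency and the two end states are given. A minimal sketch follows; the function name and the numerical values are our own assumptions, chosen only for illustration, with h = 1.

```python
def scattered_frequency(nu, h_initial, h_final, h=1.0):
    """Frequency nu' of the scattered light from H'' = H' + h*nu - h*nu'.

    h_initial and h_final stand for the energies H' and H''; h is Planck's
    constant (1 by default, an arbitrary choice of units).
    """
    nu_scattered = nu - (h_final - h_initial) / h
    if nu_scattered <= 0:
        raise ValueError("incident quantum too small to reach the final state")
    return nu_scattered


# The atom gains energy: the scattered line is displaced towards lower frequencies.
lowered = scattered_frequency(5.0, h_initial=0.0, h_final=1.5)
# The atom loses energy: the scattered line is displaced towards higher frequencies.
raised = scattered_frequency(5.0, h_initial=1.5, h_final=0.0)
print(lowered, raised)
```

When the final state coincides in energy with the initial one, the function returns the incident frequency unchanged.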

Secondly, taking the sign − in the denominator of the second term
in (187c) (which corresponds to absorption of light), we must replace
the denominator of the first term, i.e. the difference $H'''-H'$, by
$H'''-H'+h\nu'$ (which corresponds to the emission of light of frequency
$\nu'$ in the first step of the double transition). We thus get for the
probability of scattering, instead of (187c), the expression

$$\Gamma_{H''H'} = \frac{\pi^2}{h}\,\Bigl|\sum_{H'''}\Bigl(\frac{T^0_{H''H'''}\,G^0_{H'''H'}}{H'''-H'+h\nu'} + \frac{G^0_{H''H'''}\,T^0_{H'''H'}}{H'''-H'-h\nu}\Bigr)\Bigr|^2 \qquad (H''-H' = h\nu-h\nu'). \tag{188a}$$


If the incident light is polarized in the direction of the unit vector $\mathbf{q}$
and that part of the scattered radiation is considered which corresponds
to vibrations of the electron in the direction $\mathbf{q}'$, then we must put

$$T = -e(\mathbf{r}\cdot\mathbf{q})E_0 \quad\text{and}\quad G = -e(\mathbf{r}\cdot\mathbf{q}')E_0', \tag{188b}$$

where $\mathbf{r}$ is the radius vector of the electron (with respect to the nucleus
of the atom) and $E_0'$ is a certain ‘effective amplitude’. G is thus obtained
from T by replacing the amplitude of the external electric force by
a certain constant, which will be determined later.

These results can be derived from the general perturbation theory by
replacing the spontaneous emission forming one of the two steps of the
scattering process by an induced emission, i.e. an emission due to the action
of a secondary light wave with the frequency $\nu'$ and the amplitude $E_0'$.

Assuming the electron to be exposed simultaneously to the action of
these two light waves, we have for the total perturbation energy an
expression of the form

$$S = T(x,y,z)\cos(2\pi\nu t+\beta) + T'(x,y,z)\cos(2\pi\nu' t+\beta'). \tag{189}$$

This gives for the bilinear part of $\Delta_2 c_{H''H'}$ the previous expression
(187a) with $G = T'$ but with somewhat different values for the factors
f and g.

Limiting ourselves to the case of an approximate resonance in the
neighbourhood of the value (188) and dropping relatively small terms, we get

$$f(t) = \frac{1}{4h^2}\,e^{i(\beta'-\beta)}\,\frac{e^{i2\pi(\nu_{H''H'}-\nu+\nu')t}-1}{(\nu_{H''H'}-\nu+\nu')(\nu_{H'''H'}+\nu')}, \qquad
g(t) = \frac{1}{4h^2}\,e^{i(\beta'-\beta)}\,\frac{e^{i2\pi(\nu_{H''H'}-\nu+\nu')t}-1}{(\nu_{H''H'}-\nu+\nu')(\nu_{H'''H'}-\nu)}, \tag{189a}$$

which gives

$$\Delta_2 c_{H''H'} = \frac{1}{4h^2}\,e^{i(\beta'-\beta)}\sum_{H'''}\Bigl(\frac{T^0_{H''H'''}\,T'^0_{H'''H'}}{\nu_{H'''H'}+\nu'} + \frac{T'^0_{H''H'''}\,T^0_{H'''H'}}{\nu_{H'''H'}-\nu}\Bigr)\frac{e^{i2\pi(\nu_{H''H'}-\nu+\nu')t}-1}{\nu_{H''H'}-\nu+\nu'} \tag{189b}$$

and consequently

$$\Gamma_{H''H'} = \frac{\pi^2}{h}\,\Bigl|\sum_{H'''}\Bigl(\frac{T^0_{H''H'''}\,T'^0_{H'''H'}}{H'''-H'+h\nu'} + \frac{T'^0_{H''H'''}\,T^0_{H'''H'}}{H'''-H'-h\nu}\Bigr)\Bigr|^2, \tag{189c}$$

i.e. exactly formula (188a) with G replaced by $T'$. All that remains is
to assume a fixed effective value for $E_0'$ in order to obtain the probability
of scattering.


This value can be determined in the following way:

The unsharpness of the resonance implied in the preceding
calculations can be realized either by a transition of the particle into a ‘band’
$\Delta H''$ of a continuous spectrum, with exactly specified values both of
$\nu$ and $\nu'$, or by a transition into a perfectly definite state $H''$ belonging to a
discrete set, the unsharpness of the resonance being due in this case to a
variation of $\nu'$ in a small interval $\Delta\nu'$ about the value $(H'-H''+h\nu)/h$
or, in other words, to the emission of a spectral line $\nu'$ of finite width.

From the latter point of view, which we shall adopt for the present,
we must consider, instead of $S' = T'\cos(2\pi\nu' t+\beta')$, a superposition of
a set of harmonic vibrations with different frequencies contained in the
small interval $\Delta\nu'$ and with completely independent phase constants,
i.e. incoherent with regard to each other.

This means that $|T'^0_{H''H'}|^2$ must be proportional not to the square
of the sum of the amplitudes of the component vibrations, but to the
sum of the squares of these elementary amplitudes. Denoting the value
of this sum for all the frequencies contained within the interval $d\nu'$ by
$E_0'^2\,d\nu'$, we get

$$|T'^0_{H''H'}|^2 = e^2\,|(\mathbf{r}\cdot\mathbf{q}')_{H''H'}|^2\,E_0'^2,$$

or if (as has been done above) the integration is extended over the
values of the energy and not over the frequency,

$$|T'^0_{H''H'}|^2 = \frac{e^2}{h}\,|(\mathbf{r}\cdot\mathbf{q}')_{H''H'}|^2\,E_0'^2. \tag{190}$$

The corresponding transition probability is equal to

$$\frac{\pi^2}{h}\,|T'^0_{H''H'}|^2 = \frac{\pi^2 e^2}{h^2}\,|(\mathbf{r}\cdot\mathbf{q}')_{H''H'}|^2\,E_0'^2.$$

This quantity must be identified with the probability of
spontaneous emission (see Part I, eq. (93), § 17)

$$A = \frac{64\pi^4 e^2\nu'^3}{3hc^3}\,|(\mathbf{r}\cdot\mathbf{q}')_{H''H'}|^2,$$

whence it follows that

$$E_0'^2 = \frac{64\pi^2}{3c^3}\,h\nu'^3. \tag{190a}$$

Putting further $T = -e\,(\mathbf{r}\cdot\mathbf{q})\,E_0$
($\mathbf{q}$ being the direction of the vector $\mathbf{E}_0$), we get, according to (189c),

$$\Gamma_{H''H'} = \frac{64\pi^4 e^4\nu'^3 E_0^2}{3hc^3}\,\Bigl|\sum_{H'''}\Bigl(\frac{(\mathbf{r}\cdot\mathbf{q})_{H''H'''}\,(\mathbf{r}\cdot\mathbf{q}')_{H'''H'}}{H'''-H'+h\nu'} + \frac{(\mathbf{r}\cdot\mathbf{q}')_{H''H'''}\,(\mathbf{r}\cdot\mathbf{q})_{H'''H'}}{H'''-H'-h\nu}\Bigr)\Bigr|^2. \tag{190b}$$


The intensity of the scattered radiation is equal to the product of
$\Gamma_{H''H'}$ and $h\nu'$.

If $\nu'$ is different from $\nu$, and if a direct transition from the state $H'$
to the (discrete) state H” is impossible—as assumed hitherto—formula 
(190b) describes, in conjunction with the resonance condition (188), 
the so-called Raman effect or incoherent scattering of light. If the 
state H” belongs to a continuous set, corresponding to an ionized state 
of the atom, we get the Compton effect instead of the Raman effect. 
In this case it is necessary, however, to modify formula (190b), firstly 
by allowing for transitions through intermediate states belonging to the 
continuous spectrum, and secondly by allowing for the finite speed of 
light both in absorption and emission. These corrections will be intro- 
duced later in Part III where an exact theory of the Compton effect 
will be given. 


24. Theory of Transitions for an Undefined Initial State 


The coefficients $c_{H''}$ (or in particular $c_{H''H'}$) are complex quantities,
whose modulus determines the probability of the corresponding states
(or the number of copies associated with the latter), while their phases
have no direct physical significance.

We shall see later that these phases can be used for the building up
of a theory, in which the copies of the particle appear as a number of
particles of the same sort (cf. Part I, § 20). So long, however, as we
confine ourselves to one particle only, the phases of the quantities $c_{H''}$
are devoid of all meaning and must therefore not appear in the final
equations. This means that the latter must contain only the moduli
or the squares of the moduli of the coefficients $c_{H''}$.

We shall apply this principle to the problem (first treated by Dirac)
of the change in the distribution of the copies of a particle among
different states due to a perturbation of any kind when the state of the
particle at the initial instant was not exactly specified, so that only
the initial values of the probabilities $N^0_{H'}$ were known. Our problem
will consist in the determination of these probabilities $N_{H'}(t)$ as
functions of the time (for sufficiently small values of the latter).

In this form the problem is indeterminate, for the equations of the
perturbation theory involve not the probabilities $N_{H'}$, but the
probability amplitudes $c_{H'}$, whose values, both with respect to modulus and
phase, are determined by the values of their moduli $\sqrt{N^0_{H'}}$ and phases
$\gamma^0_{H'}$ at the initial moment. In order to get rid of these phases, which
are completely irrelevant so far as the probabilities are concerned, we
can average the results over them, assuming all the values of these
phases to be equally probable.

Taking the case of a discrete set of states, we have, according to (161),

$$-\frac{h}{2\pi i}\,\frac{dc_{H'}}{dt} = \sum_{H''} S_{H'H''}\,c_{H''}.$$

To these equations we shall add the conjugate complex equations

$$\frac{h}{2\pi i}\,\frac{dc^*_{H'}}{dt} = \sum_{H''} S^*_{H'H''}\,c^*_{H''} = \sum_{H''} S_{H''H'}\,c^*_{H''}.$$

Multiplying the former by $c^*_{H'}$ and the latter by $c_{H'}$, and subtracting
one from the other, we get

$$\frac{h}{2\pi i}\,\frac{d}{dt}\bigl(c_{H'}c^*_{H'}\bigr) = \sum_{H''}\bigl(S_{H''H'}\,c^*_{H''}c_{H'} - S_{H'H''}\,c^*_{H'}c_{H''}\bigr),$$

i.e.

$$\frac{dN_{H'}}{dt} = \frac{2\pi i}{h}\sum_{H''}\bigl(S_{H''H'}\,c^*_{H''}c_{H'} - S_{H'H''}\,c^*_{H'}c_{H''}\bigr). \tag{191}$$
H” 


We see that the right side of these equations cannot be expressed as
a function of the numbers $N_{H'}$.

One might be tempted to put

$$c_{H'} = \sqrt{N_{H'}}\,e^{i\gamma_{H'}}$$

and average over the phases $\gamma_{H'}$ (and $\gamma_{H''}$), considering all their values as
equally probable. This would, however, reduce the right side of (191) to
zero. In fact, we are not allowed to assume the equal probability
of all the values of the phases $\gamma$ at any time; if they were equally
probable at the instant t = 0 they will no longer be so later on.

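We shall therefore, in the right side of (191), substitute for the
probability amplitudes $c_{H'}$ approximate expressions in terms of their
initial values, up to the first approximation, so as to obtain the
second-order approximation for the time derivatives of the numbers $N_{H'}$ (it
should be remembered that the matrix components of S by which the
coefficients c are multiplied are regarded as small quantities of the first
order).

We thus get

$$c^*_{H''}c_{H'} = c^{0*}_{H''}c^0_{H'} + c^{0*}_{H''}\,\Delta_1 c_{H'} + c^0_{H'}\,\Delta_1 c^*_{H''}.$$

Now we obviously have

$$\Delta_1 c_{H'} = \sum_{H'''}\Delta_1 c_{H'H'''}\,c^0_{H'''}, \tag{191a}$$

so that

$$c^*_{H''}c_{H'} = c^{0*}_{H''}c^0_{H'} + \sum_{H'''}\bigl(\Delta_1 c_{H'H'''}\,c^0_{H'''}c^{0*}_{H''} + \Delta_1 c^*_{H''H'''}\,c^{0*}_{H'''}c^0_{H'}\bigr).$$

If now we put

$$c^0_{H'} = \sqrt{N^0_{H'}}\,e^{i\gamma^0_{H'}} \tag{191b}$$

and average over the values of the initial phases $\gamma^0_{H'}$, $\gamma^0_{H''}$, etc., regarding
them as independent of each other and equally probable, we get

$$\overline{c^*_{H''}c_{H'}} = \delta_{H'H''}\,N^0_{H'} + \Delta_1 c_{H'H''}\,N^0_{H''} + \Delta_1 c^*_{H''H'}\,N^0_{H'}$$

or, since $\Delta_1 c^*_{H''H'} = -\Delta_1 c_{H'H''}$,

$$\overline{c^*_{H''}c_{H'}} = \delta_{H'H''}\,N^0_{H'} + \Delta_1 c_{H'H''}\,\bigl(N^0_{H''}-N^0_{H'}\bigr). \tag{192}$$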


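Substituting this in (191) and remembering that

$$\Delta_1 c_{H''H'} = -\frac{2\pi i}{h}\int_0^t S_{H''H'}\,dt',$$

we get

$$\frac{dN_{H'}}{dt} = \frac{4\pi^2}{h^2}\sum_{H''}\Bigl[S_{H''H'}(t)\int_0^t S_{H'H''}(t')\,dt' + S_{H'H''}(t)\int_0^t S_{H''H'}(t')\,dt'\Bigr]\bigl(N^0_{H''}-N^0_{H'}\bigr),$$

that is,

$$\frac{dN_{H'}}{dt} = \sum_{H''}\Gamma_{H''H'}\,\bigl(N^0_{H''}-N^0_{H'}\bigr), \tag{192a}$$

with

$$\Gamma_{H''H'} = \frac{4\pi^2}{h^2}\,\frac{d}{dt}\Bigl|\int_0^t S_{H''H'}(t')\,dt'\Bigr|^2, \tag{192b}$$

which is obviously nothing else but the probability (per unit time)
of a direct transition from the state $H'$ into $H''$ or vice versa. Equation
(192a) could be obtained directly from the symmetry relation
$\Gamma_{H''H'} = \Gamma_{H'H''}$. It is easy to obtain higher approximations for $dN_{H'}/dt$,
taking account of combined transitions. This would not affect the form
of equations (192a). Instead of (192b) we should, however, obtain the
following expression for the transition probability:

$$\Gamma_{H''H'} = \frac{4\pi^2}{h^2}\,\frac{d}{dt}\Bigl|\int_0^t S_{H''H'}(t')\,dt' + \frac{2\pi}{ih}\sum_{H'''}\int_0^t S_{H''H'''}(t')\,dt'\int_0^{t'} S_{H'''H'}(t'')\,dt''\Bigr|^2. \tag{192c}$$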
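Equations of the form (192a) can be explored numerically. The sketch below goes one step beyond the text and rests on an assumption of ours: that (192a), derived for small times, may be re-applied over successive short intervals with constant, symmetrical transition probabilities. The three states and the rate values are arbitrary.

```python
def step(populations, rates, dt):
    """One Euler step of dN_i/dt = sum_j rates[i][j] * (N_j - N_i)."""
    n = len(populations)
    updated = []
    for i in range(n):
        flow = sum(rates[i][j] * (populations[j] - populations[i])
                   for j in range(n) if j != i)
        updated.append(populations[i] + dt * flow)
    return updated


# Assumed symmetrical transition probabilities per unit time between three states.
rates = [[0.0, 0.2, 0.05],
         [0.2, 0.0, 0.1],
         [0.05, 0.1, 0.0]]

populations = [1.0, 0.0, 0.0]   # all copies initially in the first state
for _ in range(5000):
    populations = step(populations, rates, dt=0.01)

total = sum(populations)
print(populations, total)
```

Because of the symmetry relation, the total number of copies is conserved at every step, and under the stated assumption the populations tend towards equal values.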


VI 


RELATIVISTIC REMODELLING AND MAGNETIC 
GENERALIZATION OF THE WAVE MECHANICS 
OF A SINGLE ELECTRON 


25. Simplest Form of Relativistic Wave Mechanics 


All the developments of the preceding chapters were based on
Schrödinger’s wave equation for a single particle moving in an external field
of force with a given potential-energy function U(x,y,z,t).

This equation, as we have seen in Chap. I, corresponds to the
pre-relativistic classical mechanics, which neglects the variation of the mass
of a particle with its velocity. In addition it does not take into account
magnetic forces, which depend not only upon the position of a particle
but also upon its velocity (being in fact proportional to the latter).

Our next problem will be to find the improved form of the funda- 
mental equation of wave mechanics for a single particle—which we 
shall think of as an electron—that will take account both of the 
variability of mass and of the magnetic forces. 

It turns out that the two parts of this problem can be solved
simultaneously (at one stroke as it were) if in reforming the Schrödinger
equation we let ourselves be guided by the basic principle of the
relativity theory, namely, the equivalence of the space coordinates and
the time (multiplied by ic), which must be expressed by the symmetry
of all the fundamental equations of physics with respect to both, and
which entails the four-dimensional character of all physical quantities.

It should be mentioned that the same principle can be applied to the 
problem of improving the equations of the classical pre-relativistic 
theory and finding their relativistically correct pre-quantum form. 

The formal correspondence between the energy-momentum relation
of Newtonian mechanics

$$\frac{1}{2m}\bigl(g_x^2+g_y^2+g_z^2\bigr) + U - W = 0 \tag{193}$$

and the Schrödinger equation written in the form

$$\Bigl[\frac{1}{2m}\bigl(p_x^2+p_y^2+p_z^2\bigr) + U + p_t\Bigr]\psi = 0, \tag{193a}$$

$$p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x},\quad p_y = \frac{h}{2\pi i}\frac{\partial}{\partial y},\quad p_z = \frac{h}{2\pi i}\frac{\partial}{\partial z},\quad p_t = \frac{h}{2\pi i}\frac{\partial}{\partial t}, \tag{193b}$$

leads us straight back to that four-dimensional representation of
physical quantities, which is the formal content of the relativity theory. We
must, therefore, so modify our original equations that they assume a
symmetrical form with respect to the components of four-dimensional
vectors appearing therein.

If, as will be done in future, the time is specified in the usual way,
i.e. by the real quantity t without the imaginary factor ic, this
symmetry will be slightly distorted by the appearance of the factor $-c^2$
or $-1/c^2$ in the product of the fourth components of any two vectors.

To begin with, we must fill up an important gap in the usual
definition of the momentum-energy vector

$$g_x = mv_x,\quad g_y = mv_y,\quad g_z = mv_z,\quad -g_t = W, \tag{193c}$$

a gap which makes this definition inconsistent from the point of view
of the relativity theory and which limits its correspondence with the
operator-vector (193b).

In Einstein’s mechanics of a particle with rest mass $m_0$ we have,
corresponding to the components of the momentum, i.e.

$$mv_x = \frac{m_0v_x}{\sqrt{1-v^2/c^2}},\quad mv_y = \frac{m_0v_y}{\sqrt{1-v^2/c^2}},\quad mv_z = \frac{m_0v_z}{\sqrt{1-v^2/c^2}},$$

as fourth component of the four-vector concerned, the ‘proper energy’

$$mc^2 = \frac{m_0c^2}{\sqrt{1-v^2/c^2}}.$$

Now the quantity $p_t$ in (193b) represents, not this proper energy,
but the total energy $E = mc^2+U$ diminished by the constant rest-energy
$m_0c^2$. For the relativistic formulation of the laws of corpuscular
mechanics we must clearly add this constant to the energy $W$, i.e. we
must put

$$-g_t = E = W+m_0c^2 = mc^2+U.$$


In addition to this, we must regard the potential energy $U$ as the fourth
component, i.e. as the ‘time-projection’, of a certain four-vector and
also take into account its space projection. This space projection $\mathbf G$,
which obviously corresponds to the momentum and which, just as $U$,
can be an arbitrary function of the coordinates and the time, will be
called the potential momentum. In the special case $\mathbf G = 0$, the only
one considered so far, the components of the force acting on the particle
reduce to the usual expressions $-\partial U/\partial x$, $-\partial U/\partial y$, $-\partial U/\partial z$.
The question as to the nature and the mathematical expression of the
force due to the vector function $\mathbf G$ will be considered later on.
We are at present only


§ 25 SIMPLEST FORM OF RELATIVISTIC WAVE MECHANICS 241 
interested in the fact that, by the introduction of the ‘potential
momentum’, the quantities $g_x$, $g_y$, $g_z$ appearing in formulae (193c) must
be defined as the components of the total momentum $m\mathbf v+\mathbf G$ just as the
quantity $-g_t$ denotes the total energy $mc^2+U$.

We obtain, therefore, instead of (193c), the formulae

$$g_x = mv_x+G_x,\quad g_y = mv_y+G_y,\quad g_z = mv_z+G_z,\quad -g_t = mc^2+U. \tag{194}$$

The components of the ‘proper energy momentum vector’ are related
to one another, according to definition, by the relation

$$(mv_x)^2+(mv_y)^2+(mv_z)^2-\frac{1}{c^2}(mc^2)^2 = -m_0^2c^2 \tag{194a}$$

(which is equivalent to the formula $m = m_0/\sqrt{1-v^2/c^2}$). In the case
$\mathbf G = 0$ this relation can be written in the form

$$(mv_x)^2+(mv_y)^2+(mv_z)^2-\frac{1}{c^2}(E-U)^2 = -m_0^2c^2.$$


In the limiting case of small velocities ($v/c \ll 1$) we can put
approximately

$$(mv)^2 \approx (m_0v)^2$$

and

$$\frac{1}{c^2}(E-U)^2 = \frac{1}{c^2}(m_0c^2+W-U)^2 \approx m_0^2c^2+2m_0(W-U).$$

Thus the previous equation reduces to

$$(m_0v_x)^2+(m_0v_y)^2+(m_0v_z)^2+2m_0(U-W) = 0,$$

which is the classical energy-momentum equation (193). It should be
noticed that it expresses the ‘law of the conservation of energy’ when
$W$ (or $E$) is constant, which can only be the case when the function
$U$ is independent of the time (static field).
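
The passage to the limit of small velocities can be checked numerically. What follows is only a minimal sketch, assuming units in which $c = 1$ and $m_0 = 1$ and a free particle ($U = 0$, so that $W$ is the kinetic energy); the function names are illustrative and do not occur in the text.

```python
import math

C = 1.0    # velocity of light (unit choice assumed for illustration)
M0 = 1.0   # rest mass (unit choice assumed for illustration)

def relativistic_residual(v):
    """Left side minus right side of relation (194a) for speed v."""
    m = M0 / math.sqrt(1.0 - v * v / (C * C))
    return (m * v) ** 2 - (m * C * C) ** 2 / (C * C) + (M0 * C) ** 2

def newtonian_residual(v):
    """(m0 v)^2 + 2 m0 (U - W) with U = 0, divided by (m0 v)^2.

    W is the exact relativistic kinetic energy, so the result measures
    how well the Newtonian relation (193) is satisfied at speed v.
    """
    m = M0 / math.sqrt(1.0 - v * v / (C * C))
    w = (m - M0) * C * C  # W = E - m0 c^2 when U = 0
    return ((M0 * v) ** 2 - 2.0 * M0 * w) / (M0 * v) ** 2

for beta in (0.001, 0.01, 0.1):
    v = beta * C
    print(beta, relativistic_residual(v), newtonian_residual(v))
```

The relativistic relation is satisfied to rounding error at every speed, while the relative error of the Newtonian relation grows roughly in proportion to $v^2/c^2$.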

We see therefore that the equation

$$(g_x-G_x)^2+(g_y-G_y)^2+(g_z-G_z)^2-\frac{1}{c^2}(g_t+U)^2+m_0^2c^2 = 0, \tag{194b}$$

which results from (194) and (194a), represents the relativistic
generalization and refinement of the Newtonian relation (193).

From this equation we can go over to the corresponding fundamental
equation of the relativistic wave mechanics in the same way as in the
non-relativistic case, namely, by replacing the vector $g$ in (194b) by
the corresponding operator-vector $p$ and equating to zero the result
obtained by the application of the resulting operator to a wave function
$\psi$. We thus get

$$D\psi = 0, \tag{195}$$

with

$$D = \left(\frac{h}{2\pi i}\frac{\partial}{\partial x}-G_x\right)^2+\left(\frac{h}{2\pi i}\frac{\partial}{\partial y}-G_y\right)^2+\left(\frac{h}{2\pi i}\frac{\partial}{\partial z}-G_z\right)^2-\frac{1}{c^2}\left(\frac{h}{2\pi i}\frac{\partial}{\partial t}+U\right)^2+m_0^2c^2. \tag{195a}$$


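In the case of ‘multiplication’ of expressions which, besides ordinary
quantities, also contain differential operators, the order of the factors
must remain unaltered. Thus the ‘product’ $\frac{\partial}{\partial x}G_x\psi$, where the operator
$\partial/\partial x$ is to be applied to the function $G_x\psi$ standing on its right side,
differs from the ‘product’ $G_x\frac{\partial}{\partial x}\psi$ by the additional term $\psi\,\partial G_x/\partial x$.

If we take this into consideration we obtain

$$\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}-G_x\right)^2\psi = \left(\frac{h}{2\pi i}\frac{\partial}{\partial x}-G_x\right)\left(\frac{h}{2\pi i}\frac{\partial\psi}{\partial x}-G_x\psi\right)$$

$$= -\frac{h^2}{4\pi^2}\frac{\partial^2\psi}{\partial x^2}-\frac{h}{2\pi i}\frac{\partial}{\partial x}(G_x\psi)-\frac{h}{2\pi i}G_x\frac{\partial\psi}{\partial x}+G_x^2\psi$$

$$= -\frac{h^2}{4\pi^2}\frac{\partial^2\psi}{\partial x^2}-\frac{h}{\pi i}G_x\frac{\partial\psi}{\partial x}-\frac{h}{2\pi i}\frac{\partial G_x}{\partial x}\psi+G_x^2\psi,$$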
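
The effect of operator ordering can be illustrated numerically. The sketch below is hedged throughout: it assumes units with $h/2\pi = 1$, and it picks $G_x = \sin x$ and a Gaussian wave packet purely for illustration (neither choice comes from the text). It applies the operator $\frac{h}{2\pi i}\frac{\partial}{\partial x}-G_x$ twice by central differences and compares the result with the expanded expression, evaluated with analytic derivatives.

```python
import cmath
import math

HBAR = 1.0  # stands for h/(2*pi); unit choice assumed for illustration

def G(x):
    return math.sin(x)   # illustrative potential momentum component

def dG(x):
    return math.cos(x)   # its derivative

def psi(x, k=2.0):
    return cmath.exp(1j * k * x - 0.5 * x * x)  # illustrative wave function

def u(f, step=1e-3):
    """Return the function obtained by applying (HBAR/i) d/dx - G to f."""
    def uf(x):
        deriv = (f(x + step) - f(x - step)) / (2.0 * step)
        return (HBAR / 1j) * deriv - G(x) * f(x)
    return uf

def expanded(x, k=2.0):
    """Expanded form of the squared operator acting on psi."""
    p = psi(x, k)
    d1 = (1j * k - x) * p                 # first derivative of psi
    d2 = ((1j * k - x) ** 2 - 1.0) * p    # second derivative of psi
    return (-HBAR ** 2 * d2
            - (2.0 * HBAR / 1j) * G(x) * d1
            - (HBAR / 1j) * dG(x) * p     # term arising from the ordering
            + G(x) ** 2 * p)

x0 = 0.3
print(abs(u(u(psi))(x0) - expanded(x0)))
```

Dropping the term proportional to $\partial G_x/\partial x$ from `expanded` would spoil the agreement, which is the point of keeping the order of the factors.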


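and similar expressions for the other terms in the equation. Written
out in detail it runs, therefore, as follows:

$$\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}-\frac{4\pi i}{h}\left(G_x\frac{\partial\psi}{\partial x}+G_y\frac{\partial\psi}{\partial y}+G_z\frac{\partial\psi}{\partial z}+\frac{U}{c^2}\frac{\partial\psi}{\partial t}\right)$$

$$-\frac{2\pi i}{h}\left(\frac{\partial G_x}{\partial x}+\frac{\partial G_y}{\partial y}+\frac{\partial G_z}{\partial z}+\frac{1}{c^2}\frac{\partial U}{\partial t}\right)\psi-\frac{4\pi^2}{h^2}\left(G_x^2+G_y^2+G_z^2-\frac{1}{c^2}U^2+m_0^2c^2\right)\psi = 0. \tag{196}$$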


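If the rest-mass vanishes ($m_0 = 0$), and if there are no external forces,
i.e. in the case of an Einstein photon, this equation reduces to the
equation

$$\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0$$

for electromagnetic waves. Further, it can easily be shown that when
$m_0 \neq 0$ and $\mathbf G = 0$ the relativistic wave equation (196) for the special
case of a harmonic vibration process (i.e. motion with a given constant
energy) agrees with the relativistic equation (48b), § 13, Part I.
In fact, if we put $\partial U/\partial t = 0$ and $\psi = \psi^0(x, y, z)e^{-2\pi i\nu t}$, equation
(196) reduces to the form

$$\nabla^2\psi^0+\left(\frac{4\pi^2\nu^2}{c^2}-\frac{8\pi^2\nu}{hc^2}U+\frac{4\pi^2}{h^2c^2}U^2-\frac{4\pi^2}{h^2}m_0^2c^2\right)\psi^0 = 0,$$

or, with $\nu = E/h$,

$$\nabla^2\psi^0+\frac{4\pi^2}{h^2c^2}\left[(E-U)^2-m_0^2c^4\right]\psi^0 = 0, \tag{196a}$$

which is identical with (48b), Part I.

We shall now investigate the relation of equation (196) to the equation
of motion of Einstein’s mechanics. For this purpose we shall put
in (196)

$$\psi = \text{const.}\,e^{2\pi iS/h}. \tag{197}$$

After dividing the result by $(2\pi i/h)^2e^{2\pi iS/h}$ and dropping the terms which
contain the small factor $h/2\pi i$ we obtain the equation

$$\left(\frac{\partial S}{\partial x}\right)^2+\left(\frac{\partial S}{\partial y}\right)^2+\left(\frac{\partial S}{\partial z}\right)^2-\frac{1}{c^2}\left(\frac{\partial S}{\partial t}\right)^2-2\left(G_x\frac{\partial S}{\partial x}+G_y\frac{\partial S}{\partial y}+G_z\frac{\partial S}{\partial z}+\frac{U}{c^2}\frac{\partial S}{\partial t}\right)$$

$$+G_x^2+G_y^2+G_z^2-\frac{1}{c^2}U^2+m_0^2c^2 = 0,$$

which must obviously be the relativity form of the Hamilton-Jacobi
equation. It can be written more briefly in the form

$$\left(\frac{\partial S}{\partial x}-G_x\right)^2+\left(\frac{\partial S}{\partial y}-G_y\right)^2+\left(\frac{\partial S}{\partial z}-G_z\right)^2-\frac{1}{c^2}\left(\frac{\partial S}{\partial t}+U\right)^2+m_0^2c^2 = 0, \tag{197a}$$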


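and can be obtained directly from (195) if we replace the vector $p$ in
$D$ by the vector $g$ defined according to the equations

$$g_x = \frac{\partial S}{\partial x},\quad g_y = \frac{\partial S}{\partial y},\quad g_z = \frac{\partial S}{\partial z},\quad g_t = \frac{\partial S}{\partial t}. \tag{197b}$$

From these equations, which refer to the copy continuum of one
particle, one can easily go over to the relativistic equations of motion
of a given copy and, indeed, just as in the non-relativity theory, by
differentiation of equation (197a) with regard to the coordinates and
the time, bearing in mind the following relations resulting from
(194) and (197b):

$$\frac{\partial S}{\partial x}-G_x = mv_x,\quad \frac{\partial S}{\partial y}-G_y = mv_y,\quad \frac{\partial S}{\partial z}-G_z = mv_z,\quad \frac{\partial S}{\partial t}+U = -mc^2.$$

If we differentiate (197a) with regard to $x$ and divide by $2m$, we get

$$\left(\frac{\partial^2S}{\partial x^2}-\frac{\partial G_x}{\partial x}\right)v_x+\left(\frac{\partial^2S}{\partial x\,\partial y}-\frac{\partial G_y}{\partial x}\right)v_y+\left(\frac{\partial^2S}{\partial x\,\partial z}-\frac{\partial G_z}{\partial x}\right)v_z+\frac{\partial^2S}{\partial x\,\partial t}+\frac{\partial U}{\partial x} = 0,$$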


or, by (197b),

$$\frac{\partial g_x}{\partial x}\frac{dx}{dt}+\frac{\partial g_x}{\partial y}\frac{dy}{dt}+\frac{\partial g_x}{\partial z}\frac{dz}{dt}+\frac{\partial g_x}{\partial t}-v_x\frac{\partial G_x}{\partial x}-v_y\frac{\partial G_y}{\partial x}-v_z\frac{\partial G_z}{\partial x}+\frac{\partial U}{\partial x} = 0,$$

i.e.

$$\frac{dg_x}{dt} = \frac{\partial}{\partial x}(\mathbf v\cdot\mathbf G-U). \tag{198}$$

The three-dimensional velocity vector $\mathbf v$ referring to a definite particle
is here no longer considered as an explicit function of the coordinates
and the time. Therefore, its partial derivatives with regard to $x$, $y$, $z$, $t$
must be put equal to zero.

The equations for $g_y$ and $g_z$ analogous to (198) will not be written
down here. The fourth equation runs

$$\frac{dg_t}{dt} = \frac{\partial}{\partial t}(\mathbf v\cdot\mathbf G-U).$$

If the potential functions $\mathbf G$ and $U$ are independent of the time
(static field of force) this equation reduces to $dg_t/dt = 0$, i.e. $-g_t = E$
$= \text{const.}$ (law of the conservation of energy).

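If we split up $g_x$ in (198) into the sum of $mv_x$ and $G_x$, we then get

$$\frac{dg_x}{dt} = \frac{d}{dt}(mv_x)+\frac{dG_x}{dt} = \frac{d}{dt}(mv_x)+\frac{\partial G_x}{\partial t}+v_x\frac{\partial G_x}{\partial x}+v_y\frac{\partial G_x}{\partial y}+v_z\frac{\partial G_x}{\partial z},$$

and consequently,

$$\frac{d}{dt}(mv_x) = -\frac{\partial U}{\partial x}-\frac{\partial G_x}{\partial t}+v_y\left(\frac{\partial G_y}{\partial x}-\frac{\partial G_x}{\partial y}\right)-v_z\left(\frac{\partial G_x}{\partial z}-\frac{\partial G_z}{\partial x}\right). \tag{198a}$$

The right side of this equation must obviously represent the x-component
of the force $\mathbf f$ acting on the particle.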


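If we put

$$U = e\phi,\quad \mathbf G = \frac{e}{c}\mathbf A \tag{199}$$

with

$$E_x = -\frac{\partial\phi}{\partial x}-\frac{1}{c}\frac{\partial A_x}{\partial t},\quad E_y = -\frac{\partial\phi}{\partial y}-\frac{1}{c}\frac{\partial A_y}{\partial t},\quad E_z = -\frac{\partial\phi}{\partial z}-\frac{1}{c}\frac{\partial A_z}{\partial t} \tag{199a}$$

$\left(\mathbf E = -\operatorname{grad}\phi-\frac{1}{c}\frac{\partial\mathbf A}{\partial t}\right)$,

$$H_x = \frac{\partial A_z}{\partial y}-\frac{\partial A_y}{\partial z},\quad H_y = \frac{\partial A_x}{\partial z}-\frac{\partial A_z}{\partial x},\quad H_z = \frac{\partial A_y}{\partial x}-\frac{\partial A_x}{\partial y} \tag{199b}$$

$(\mathbf H = \operatorname{curl}\mathbf A)$,

we obtain

$$f_x = e\left(E_x+\frac{v_y}{c}H_z-\frac{v_z}{c}H_y\right),$$

or in vector notation

$$\mathbf f = e\left(\mathbf E+\frac{1}{c}\mathbf v\times\mathbf H\right), \tag{200}$$

and

$$\frac{d}{dt}(m\mathbf v) = \mathbf f. \tag{200a}$$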


Here $\phi$ and $\mathbf A$ are the scalar (electric) and the vector (magnetic)
potentials, $\mathbf E$ and $\mathbf H$ the electric and magnetic field strengths
respectively, while $e$ is the electric charge of the particle. A point-like
corpuscle can thus be defined by two constants only: its rest-mass and
its charge.

The vector defined by (200) represents, therefore, the external force 
(so-called ‘Lorentz force’) acting on an electron or a proton which is 
moving in an arbitrary electromagnetic field. 
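
As a small illustration of formula (200), the following sketch evaluates the force for given field vectors. It is a hedged example only: the function names and the numerical values (in arbitrary units with $e = c = 1$) are assumptions made for illustration.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))

def lorentz_force(e, c, E, v, H):
    """f = e (E + v x H / c), as in formula (200)."""
    vxh = cross(v, H)
    return tuple(e * (E[i] + vxh[i] / c) for i in range(3))

E_field = (0.5, 0.0, 0.0)   # illustrative electric field
H_field = (0.0, 0.0, 2.0)   # illustrative magnetic field
velocity = (1.0, 0.0, 0.0)  # illustrative particle velocity

force = lorentz_force(1.0, 1.0, E_field, velocity, H_field)
power = dot(force, velocity)  # only the electric field does work
print(force, power)
```

The magnetic part of the force is perpendicular to the velocity, so the work done per unit time equals $e\,\mathbf E\cdot\mathbf v$.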

The time projection of the four-dimensional equation of motion, of
which (200a) is the space projection, has the form

$$\frac{d}{dt}(mc^2) = e\,\mathbf E\cdot\mathbf v = e(E_xv_x+E_yv_y+E_zv_z). \tag{200b}$$

We thus obtain the relation

$$\frac{d}{dt}(mc^2) = \mathbf f\cdot\mathbf v = \mathbf v\cdot\frac{d}{dt}(m\mathbf v),$$

from which at once follows the well-known formula

$$m = \frac{m_0}{\sqrt{1-v^2/c^2}}.$$
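It still remains to find out the expressions, corresponding to the
relativity wave equation just considered, for the quantities $\rho$
(probability density) and $\mathbf j$ (probability current density). This is done
most simply as follows (according to W. Gordon). We first introduce the
operators:

$$u_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}-G_x,\quad u_y = \frac{h}{2\pi i}\frac{\partial}{\partial y}-G_y,\quad u_z = \frac{h}{2\pi i}\frac{\partial}{\partial z}-G_z,\quad u_t = \frac{1}{c}\left(\frac{h}{2\pi i}\frac{\partial}{\partial t}+U\right), \tag{201}$$

by means of which we can write the relativistic wave equation (195) in
the form

$$(u_x^2+u_y^2+u_z^2-u_t^2+m_0^2c^2)\psi = 0. \tag{201a}$$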

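We multiply this equation on the left by $\psi^*$ and subtract from it the
conjugate equation for $\psi^*$ multiplied by $\psi$. We then get, bearing in
mind (196), the formula:

$$\frac{\partial}{\partial x}\left(\psi^*\frac{\partial\psi}{\partial x}-\psi\frac{\partial\psi^*}{\partial x}-\frac{4\pi i}{h}G_x\psi\psi^*\right)+\frac{\partial}{\partial y}\left(\psi^*\frac{\partial\psi}{\partial y}-\psi\frac{\partial\psi^*}{\partial y}-\frac{4\pi i}{h}G_y\psi\psi^*\right)$$

$$+\frac{\partial}{\partial z}\left(\psi^*\frac{\partial\psi}{\partial z}-\psi\frac{\partial\psi^*}{\partial z}-\frac{4\pi i}{h}G_z\psi\psi^*\right)-\frac{1}{c^2}\frac{\partial}{\partial t}\left(\psi^*\frac{\partial\psi}{\partial t}-\psi\frac{\partial\psi^*}{\partial t}+\frac{4\pi i}{h}U\psi\psi^*\right) = 0,$$

or

$$\frac{\partial}{\partial x}(\psi^*u_x\psi+\psi u_x^*\psi^*)+\frac{\partial}{\partial y}(\psi^*u_y\psi+\psi u_y^*\psi^*)+\frac{\partial}{\partial z}(\psi^*u_z\psi+\psi u_z^*\psi^*)-\frac{1}{c}\frac{\partial}{\partial t}(\psi^*u_t\psi+\psi u_t^*\psi^*) = 0.$$

This formula can be regarded as the equation of continuity if we define
the quantities

$$j_x = \frac{1}{2m_0}(\psi^*u_x\psi+\psi u_x^*\psi^*)\ \text{(and similarly } j_y, j_z\text{)},\qquad \rho = -\frac{1}{2m_0c}(\psi^*u_t\psi+\psi u_t^*\psi^*) \tag{201b}$$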


as the components of the current-density vector and the copy density
respectively. With regard to the first, this definition is the immediate
generalization of that given earlier. The expression for $\rho$, on the other
hand, seems to be completely different from $\psi\psi^*$ which has been used
so far. We can easily convince ourselves, however, by the example of
a conservative motion, that this difference is, in practice, quite
unimportant. Putting

$$\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = -E\psi\quad\text{and}\quad -\frac{h}{2\pi i}\frac{\partial\psi^*}{\partial t} = -E\psi^*,$$

we obtain

$$\psi^*u_t\psi+\psi u_t^*\psi^* = -\frac{2}{c}\psi\psi^*(E-U) = -\frac{2}{c}\psi\psi^*(m_0c^2+W-U), \tag{201c}$$

and hence

$$\rho = \psi\psi^*\left(1+\frac{W-U}{m_0c^2}\right),$$

i.e., in so far as the kinetic energy $W-U$ is small compared with $m_0c^2$,

$$\rho \approx \psi\psi^*.$$
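
To give an idea of the size of the correction factor $1+(W-U)/m_0c^2$, the following sketch evaluates it for an electron. The rest-energy value is a modern figure and the sample kinetic energy of 13.6 eV is chosen only for illustration; both are assumptions that do not come from the text.

```python
# Rest energy m0 c^2 of the electron in electron-volts
# (modern value, assumed here; not taken from the text).
REST_ENERGY_EV = 510998.95

def density_ratio(kinetic_energy_ev):
    """rho / (psi psi*) = 1 + (W - U) / (m0 c^2) for a conservative motion."""
    return 1.0 + kinetic_energy_ev / REST_ENERGY_EV

# Illustrative kinetic energy of the order met with in light atoms.
print(density_ratio(13.6) - 1.0)
```

For kinetic energies of a few electron-volts the two definitions of the density therefore differ by a few parts in a hundred thousand.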

With regard to the exact meaning of $\psi\psi^*$, one can easily show that
it corresponds to the rest density. This can be seen, for example, from
the relation $\rho/(\psi\psi^*) = m/m_0$ which is obtained from (201c) if the mass
$m$ is introduced by means of the usual formula

$$m = m_0+\frac{W-U}{c^2}.$$

26. Magnetic Forces in the Approximate Non-Relativistic Wave 
Mechanics 

If in reducing equation (196) to the form corresponding to conservative
motion the potential momentum is supposed to be different from zero,
we get instead of (196a) an equation which in vector form can be
written as follows:

$$\left\{\left(\frac{h}{2\pi i}\nabla-\mathbf G\right)^2-\frac{1}{c^2}\left[(E-U)^2-m_0^2c^4\right]\right\}\psi = 0. \tag{202}$$

If the energy $W = E-m_0c^2$ is small compared with the rest-energy
$E_0 = m_0c^2$, which classically corresponds to motion with a velocity $v$
small compared with the velocity of light, then we can put with
sufficiently good approximation

$$(E-U)^2 = (E_0+W-U)^2 \approx E_0^2+2E_0(W-U),$$

neglecting the relatively small term $(W-U)^2$, and thus replace equation
(202), which is supposed to be exact, by the approximate equation

$$\left[\left(\frac{h}{2\pi i}\nabla-\mathbf G\right)^2-2m_0(W-U)\right]\psi = 0. \tag{202a}$$

This equation corresponds to the classical equation of motion allowing 
for the presence of magnetic forces (derived from the constant potential 
G) but neglecting the relativistic variation of the mass with velocity. 
As a rule, the magnetic forces are relatively weak, so that the terms 
of (202a) which are quadratic in G can be neglected compared with 
the linear terms. With this condition, equation (202a) reduces to the 
still simpler form 
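
$$\left[\left(\frac{h}{2\pi i}\right)^2\nabla^2-\frac{h}{2\pi i}(\nabla\cdot\mathbf G+\mathbf G\cdot\nabla)-2m_0(W-U)\right]\psi = 0.$$

Now we have

$$\nabla\cdot\mathbf G\psi = \operatorname{div}(\mathbf G\psi) = \mathbf G\cdot\nabla\psi+\psi\operatorname{div}\mathbf G.$$

It is well known further that in the case of a static field the divergence
of the magnetic potential $\mathbf A$ vanishes, so that we have $\operatorname{div}\mathbf G = 0$.
The preceding equation can therefore be written in the form

$$\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2-\frac{h}{2\pi im_0}\mathbf G\cdot\nabla+U-W\right]\psi = 0. \tag{202b}$$
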
So far we have been making perfectly permissible approximations. We 


are now going to generalize the preceding approximate equations for
the case of non-conservative motion (in a static or non-static field)
in the same way as was done before with $\mathbf G = 0$, namely, by replacing
the energy $W$ by the operator $-p_t$ (or $-p_t-m_0c^2$; the constant term
$m_0c^2$ is immaterial in this case because it is absorbed by the potential
energy).

We thus obtain the equations

$$\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2-\frac{h}{2\pi im_0}\mathbf G\cdot\nabla+U+p_t\right]\psi = 0 \tag{203}$$

for weak magnetic fields or

$$\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla-\mathbf G\right)^2+U+p_t\right]\psi = 0 \tag{203a}$$

for strong fields; these can be considered as the generalization of
Schrödinger’s equation (193a) for the case of the presence of magnetic
forces, with neglect of the relativistic variation of mass with velocity.

The transition from equations (202 a) and (202 b) to (203) and (203 a) 
is certainly an illogical step, which, moreover, is in contradiction with 
the results arrived at in the preceding section. For if equations (202 a) 
and (202b) are permissible approximations of equation (202), which is 
supposed to be exact, referring to the case of motion with a definite 
energy, equations (203) or (203 a) cannot be considered as an approxima- 
tion, in the strict sense of the word, to the general equation (196). In 
fact, the latter involves a second derivative of $\psi$ with regard to the time,
which we are not entitled to drop or to replace by a first derivative
multiplied by a constant factor, unless the dependence of $\psi$ upon the
time is given by the factor $e^{-i2\pi Et/h}$, corresponding to a motion with
the constant energy $E$.

We have here an approximation of a kind similar to that which is
constituted by the Hamilton-Jacobi equation with respect to
Schrödinger’s equation for the function $S = (h/2\pi i)\log\psi$: in the latter
case, however, it is the second derivatives with regard to the space
coordinates and not to the time which have to be dropped.

The preceding consideration does not, however, invalidate equations
(203) and (203a) as good approximations to the truth within a certain
range corresponding to a negligible variation of the mass with velocity.
Apart from the fact that the validity of the relativistic equation (196)
still remains to be proved (and we shall see later that, as a matter of
fact, the contrary can be proved), equations (203) and (203a) represent
a very natural generalization of Schrödinger’s equation for the


§ 26 MAGNETIC FORCES 249 
presence of magnetic forces, and must therefore describe the motion
affected by such forces just as well as Schrödinger’s equation describes
a motion unaffected by the latter.

An important advantage of the ‘approximate’ equations (203) and
(203a) over the ‘exact’ equation (196) consists in the fact that they
fit into the general scheme of the operator theory developed on the
basis of Schrödinger’s equation, since they can be written in the same
form, namely,

$$(H+p_t)\psi = 0, \tag{204}$$

where the Hamiltonian or energy operator $H$ must be defined by the
generalized formula

$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla-\mathbf G\right)^2+U \tag{204a}$$

or

$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2-\frac{h}{2\pi im_0}\mathbf G\cdot\nabla+U. \tag{204b}$$


Equation (196), since it contains the square of the operator $p_t$, cannot
be written in the form (204), unless we assume that it is possible
to extract square roots of operators in the same way as of ordinary
numbers and succeed in finding an equation linear with regard to $p_t$ and
actually equivalent to (196).

Leaving this question till a later section, we shall now indicate briefly
the principal modifications of the general theory, developed in the
preceding chapters, which are necessitated by the generalized form of the
Hamiltonian operator (204a) or (204b).

First of all, we must notice that this operator is complex (which does
not prevent it from representing a real quantity, just as the operator
$\frac{h}{2\pi i}\frac{\partial}{\partial x}$ does). We must distinguish therefore the operator $H$ from
the conjugate complex operator $H^*$, which determines the conjugate
complex wave function $\psi^*$ by the equation

$$(H^*-p_t)\psi^* = 0. \tag{204c}$$

Multiplying this equation by $\psi$ and subtracting it from equation
(204) multiplied by $\psi^*$, we get

$$\frac{\partial\rho}{\partial t}+\operatorname{div}\mathbf j = 0,$$

with the old expression $\psi\psi^*$ for $\rho$ and the expression

$$\mathbf j = \frac{1}{2m_0}\left[\psi^*\left(\frac{h}{2\pi i}\nabla-\mathbf G\right)\psi+\psi\left(-\frac{h}{2\pi i}\nabla-\mathbf G\right)\psi^*\right] \tag{205}$$

for the current density. This expression turns out to be the same for
the two Hamiltonians (204a) and (204b), and coincides with the
expression derived above from the ‘exact’ relativistic theory [cf. the first
equation (201b)].

Equation (205) can obviously be rewritten in the form

$$\mathbf j = \frac{\rho}{m_0}R(\nabla S-\mathbf G) = \frac{\rho}{m_0}\{\nabla R(S)-\mathbf G\}, \tag{205a}$$

where

$$S = \frac{h}{2\pi i}\log\psi$$

and $R(S)$ is its real part. In the approximation corresponding to the
classical (Newtonian) theory of the motion of an electron in an
electromagnetic field, $S$ is the action function and its gradient $\nabla S$ is the
total momentum $\mathbf g$. The difference $\nabla S-\mathbf G$ thus reduces to the proper
momentum $m_0\mathbf v$ and the vector $\mathbf j$ reduces to the product $\rho\mathbf v$, just as
in the absence of magnetic forces, as, of course, is to be expected.

The complex character of the operator $H$ necessitates the revision of
some of the properties of its characteristic functions, which were
established on the assumption that $H$ was real. This refers, in the first
place, to the orthogonality property which was deduced from the
self-adjointness of $H$, i.e. from the formula

$$\int(f_1Hf_2-f_2Hf_1)\,dV = 0.$$

Now in the general case of a complex $H$ defined by (204a) or (204b),
this formula does not hold and must be replaced by

$$\int(f_1Hf_2-f_2H^*f_1)\,dV = 0. \tag{206}$$

We have, in fact, according to either one of the two definitions of $H$,

$$f_1Hf_2-f_2H^*f_1 = -\frac{h^2}{8\pi^2m_0}(f_1\nabla^2f_2-f_2\nabla^2f_1)-\frac{h}{2\pi im_0}(f_1\mathbf G\cdot\nabla f_2+f_2\mathbf G\cdot\nabla f_1)$$

$$= -\frac{h^2}{8\pi^2m_0}\operatorname{div}(f_1\nabla f_2-f_2\nabla f_1)-\frac{h}{2\pi im_0}\mathbf G\cdot\nabla(f_1f_2)$$

or, so long as $\operatorname{div}\mathbf G = 0$,

$$f_1Hf_2-f_2H^*f_1 = \operatorname{div}\mathbf f_{12}, \tag{206a}$$

where

$$\mathbf f_{12} = -\frac{h^2}{8\pi^2m_0}(f_1\nabla f_2-f_2\nabla f_1)-\frac{h}{2\pi im_0}\mathbf Gf_1f_2. \tag{206b}$$

If, therefore, the functions $f_1$ and $f_2$ vanish sufficiently rapidly at infinity
(so that the integral $\int f_1f_2\,dV$ converges), we must have equation (206).
Putting, in particular, $f_1 = \psi^*_{H'}$ and $f_2 = \psi_{H''}$, where $H'$ and $H''$ are
two different (real) characteristic values of $H$, we get

$$\int(\psi^*_{H'}H\psi_{H''}-\psi_{H''}H^*\psi^*_{H'})\,dV = (H''-H')\int\psi^*_{H'}\psi_{H''}\,dV = 0,$$

whence

$$\int\psi^*_{H'}\psi_{H''}\,dV = 0\quad(H''\neq H'),$$

as before.

It should be mentioned that in the case of a real $H$ (i.e. in the absence
of magnetic forces), the characteristic functions $\psi_{H'}$, neglecting the
time-factor $e^{-i2\pi H't/h}$, can always be defined to be real, i.e. to have real
amplitudes $\psi^0_{H'}(x, y, z)$, while in the case of a complex $H$ these amplitudes
are complex. The orthogonality relation holds therefore only in the
above form, and not in the form

$$\int\psi_{H'}\psi_{H''}\,dV = 0\quad\text{or}\quad\int\psi^*_{H'}\psi^*_{H''}\,dV = 0$$

in which it can be expressed if $H$ is real.
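
The distinction between the two forms of the orthogonality relation can be illustrated by a finite-dimensional analogue. The sketch below is an assumption-laden toy model, not the operator of the text: it replaces $H$ by a two-row matrix with a complex off-diagonal element and replaces the integrals by sums over components; all names are illustrative.

```python
def eigen_2x2_hermitian(a, b, d):
    """Eigenvalues and unnormalized eigenvectors of [[a, b], [conj(b), d]].

    a and d are real, b may be complex and must not vanish.
    """
    mean = 0.5 * (a + d)
    root = ((0.5 * (a - d)) ** 2 + abs(b) ** 2) ** 0.5
    values = (mean - root, mean + root)
    vectors = tuple((b, lam - a) for lam in values)
    return values, vectors

def hermitian_product(f, g):
    """Analogue of the integral of f* g."""
    return sum(complex(x).conjugate() * y for x, y in zip(f, g))

def bilinear_product(f, g):
    """Analogue of the integral of f g (no complex conjugation)."""
    return sum(x * y for x, y in zip(f, g))

# Complex case (analogue of a Hamiltonian with magnetic forces).
_, (c1, c2) = eigen_2x2_hermitian(1.0, 1j, 2.0)
print(hermitian_product(c1, c2), bilinear_product(c1, c2))

# Real case (analogue of a Hamiltonian without magnetic forces).
_, (r1, r2) = eigen_2x2_hermitian(1.0, 1.0, 2.0)
print(hermitian_product(r1, r2), bilinear_product(r1, r2))
```

In the complex case only the product with the conjugate vanishes, whereas in the real case both products vanish, in line with the remark above.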

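It should also be mentioned that the property of self-adjointness
expressed by equation (206a) refers not only to the operator $H$ but
to any operator which represents a real quantity, i.e. which is a real
function of the coordinates and the elementary operators $p_x$, $p_y$, $p_z$,
or of the vector-operator $\mathbf p = \frac{h}{2\pi i}\nabla$. This can easily be shown with
the help of the relations

$$f_1p_x^{2n}f_2-f_2p_x^{2n}f_1 = \left(\frac{h}{2\pi i}\right)^{2n}\left(f_1\frac{\partial^{2n}f_2}{\partial x^{2n}}-f_2\frac{\partial^{2n}f_1}{\partial x^{2n}}\right) = \left(-\frac{h^2}{4\pi^2}\right)^n\frac{\partial f_{12}}{\partial x},$$

where

$$f_{12} = f_1\frac{\partial^{2n-1}f_2}{\partial x^{2n-1}}-\frac{\partial f_1}{\partial x}\frac{\partial^{2n-2}f_2}{\partial x^{2n-2}}+\frac{\partial^2f_1}{\partial x^2}\frac{\partial^{2n-3}f_2}{\partial x^{2n-3}}-\dots-\frac{\partial^{2n-1}f_1}{\partial x^{2n-1}}f_2,$$

and

$$f_1p_x^{2n+1}f_2+f_2p_x^{2n+1}f_1 = \left(\frac{h}{2\pi i}\right)^{2n+1}\frac{\partial f_{12}}{\partial x}$$

with

$$f_{12} = f_1\frac{\partial^{2n}f_2}{\partial x^{2n}}-\frac{\partial f_1}{\partial x}\frac{\partial^{2n-1}f_2}{\partial x^{2n-1}}+\dots+\frac{\partial^{2n}f_1}{\partial x^{2n}}f_2,$$

in conjunction with

$$(p^{2n})^* = p^{2n},\qquad (p^{2n+1})^* = -p^{2n+1}.$$

We can thus say that not only the energy operator, but any operator
$F$ representing a real physical quantity is self-adjoint in the sense of
the equation

$$f_1Ff_2-f_2F^*f_1 = \operatorname{div}\mathbf f_{12},$$

and that the characteristic functions of this operator are orthogonal
with regard to each other in the same sense as the characteristic
functions of the energy operator.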


Another result which was associated with the reality of $H$ and its
self-adjointness in the old sense was the possibility of replacing the
differential equation for its characteristic functions and values

$$(H-H')\psi = 0$$

by the variational equation

$$\delta\int\psi^*H\psi\,dV = 0,$$

with the condition $\int\psi\psi^*\,dV = \text{const.}\ (= 1)$.

Since, in the case of a complex $H$, the function $\psi^*$ no longer satisfies
the same equation as $\psi$, the preceding results seem to require a
modification.

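As a matter of fact, however, no such modification is needed, for we
have

$$\delta\int\psi^*H\psi\,dV = \int\delta\psi^*H\psi\,dV+\int\psi^*H\delta\psi\,dV$$

and, with the help of (206),

$$\int\psi^*H\delta\psi\,dV = \int\delta\psi\,H^*\psi^*\,dV,$$

that is,

$$\delta\int\psi^*H\psi\,dV = \int\delta\psi^*H\psi\,dV+\int\delta\psi\,H^*\psi^*\,dV = \delta\int\psi H^*\psi^*\,dV.$$

The variational principle thus preserves its usual form

$$\delta\bar H = 0,\qquad\bar E = \text{const.}\ (= 1)$$

with

$$\bar H = \int\psi^*H\psi\,dV = \int\psi H^*\psi^*\,dV$$

and

$$\bar E = \int\psi\psi^*\,dV.$$

As has been already pointed out, the two equations $\delta\bar H = 0$, $\delta\bar E = 0$
can be replaced by the single one $\delta\bar H = 0$ if $\bar H$ is defined by the formula

$$\bar H = \int\psi^*H\psi\,dV\Big/\int\psi\psi^*\,dV\quad\left(\text{or}\ \int\psi H^*\psi^*\,dV\Big/\int\psi\psi^*\,dV\right),$$

without any normalizing condition for the function $\psi$.

It should be noticed further that the two equations $\delta\bar H = 0$, $\delta\bar E = 0$
can be split up, as it were, into the following two pairs of equations:

$$\int\delta\psi^*H\psi\,dV = 0,\quad\int\delta\psi^*\psi\,dV = 0 \tag{207}$$

and

$$\int\delta\psi\,H^*\psi^*\,dV = 0,\quad\int\psi^*\delta\psi\,dV = 0,$$

the first pair being equivalent to the equation $(H-H')\psi = 0$ and the
second to the equation $(H^*-H')\psi^* = 0$.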


The preceding result can easily be generalized for non-stationary
motion, the equation $\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ being equivalent to

$$\int\delta\psi^*\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi\,dV = 0 \tag{207a}$$

and the conjugate complex equation $\left(H^*-\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi^* = 0$ to

$$\int\delta\psi\left(H^*-\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi^*\,dV = 0.$$

So long as $\psi$ is the exact solution of the equation $\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$
and $\delta\psi^*$ is quite arbitrary, the variational equation (207a) is nothing
but a transcription of the ordinary differential equation of motion. The
same variational equation is obtained, however, as the condition for
the error involved to be permanently small,† when $\psi$ is replaced by an
approximate function of some relatively simple form $\psi_1$.

At some initial moment $t = t_0$ the form of the function $\psi$ can be
fixed quite arbitrarily. We can accordingly identify $\psi(t_0)$ with $\psi_1(t_0)$.
Now $\psi_1(t)$ does not satisfy the equation $\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ but an
equation of the form

$$\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi_1+\psi_2 = 0. \tag{207b}$$

Our problem is thus reduced to that of making the additional term
$\psi_2(t)$ as small as possible for any time $t$. Taking $t = t_0+dt$, we get from
the preceding equation

$$\psi_1(t_0+dt) = \psi_1(t_0)-\frac{2\pi i}{h}H\psi_1(t_0)\,dt-\frac{2\pi i}{h}\psi_2(t_0)\,dt.$$

Now if the function $\psi_1(t_0+dt)$ is altered by a small amount $\delta\psi_1(t_0+dt)$,
the function $\psi_1(t_0)$ remaining the same as before, the corresponding
variation of the correction term $\psi_2(t_0)$ will be

$$\delta\psi_2(t_0) = -\frac{h}{2\pi i\,dt}\delta\psi_1(t_0+dt).$$

The condition that $\psi_2(t_0)$ should be as small as possible for all values
of the coordinates can be stated as the minimum condition for the
integral $\int\psi_2^*(t_0)\psi_2(t_0)\,dV$ and is equivalent accordingly to the equation

$$\int\delta\psi_2^*(t_0)\psi_2(t_0)\,dV = 0.$$


† The argument presented below is taken from Dirac’s appendix to the Russian edition
of his book, The Principles of Quantum Mechanics.


Replacing here $\delta\psi_2^*(t_0)$ by $\frac{h}{2\pi i\,dt}\delta\psi_1^*(t_0+dt)$, we get

$$\int\delta\psi_1^*(t_0+dt)\psi_2(t_0)\,dV = 0,$$

or passing to the limit $dt\to0$ and dropping the index 0 (since the above
results must hold for all values of $t$)

$$\int\delta\psi_1^*(t)\psi_2(t)\,dV = 0.$$

This equation means that the correction $\psi_2(t)$ must be orthogonal to
any variation of the approximate function $\psi_1(t)$. Hence $\psi_2$ can be
eliminated from the equation (207b) if the latter is multiplied by $\delta\psi_1^*$
and integrated over the coordinates, thus giving equation (207a) with
the exact function $\psi$ replaced by the approximate one $\psi_1$.
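The expressions $\int\psi^*H\psi\,dV$ and $\int\psi H^*\psi^*\,dV$ for $\bar H$ can easily be put
in the symmetrical form

$$\bar H = \int\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\psi-\mathbf G\psi\right)\cdot\left(-\frac{h}{2\pi i}\nabla\psi^*-\mathbf G\psi^*\right)+U\psi\psi^*\right]dV, \tag{208}$$

if $H$ is defined by (204a), that is

$$\bar H = \int\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\psi\right)\cdot\left(-\frac{h}{2\pi i}\nabla\psi^*\right)-\mathbf G\cdot\mathbf j-\frac{G^2}{2m_0}\rho+U\rho\right]dV, \tag{208a}$$

where $\rho = \psi\psi^*$ is the density of probability and $\mathbf j$ the probability
current density as defined by (205). Using the approximate expression
(204b) for $H$, we get, instead of (208a),

$$\bar H = \int\left[\frac{h^2}{8\pi^2m_0}\nabla\psi\cdot\nabla\psi^*-\frac{h}{4\pi im_0}\mathbf G\cdot(\psi^*\nabla\psi-\psi\nabla\psi^*)+U\psi\psi^*\right]dV, \tag{208b}$$

which coincides with (208a), apart from the term quadratic in $\mathbf G$, if, in
the above definition of $\mathbf j$, we put $\mathbf G = 0$, thus coming back to the old
definition of the current density

$$\mathbf j = \frac{h}{4\pi im_0}(\psi^*\nabla\psi-\psi\nabla\psi^*).$$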


So long as the reality of the characteristic values of the operator H 
and the mutual orthogonality of its characteristic functions is unaffected 
by that change of it which corresponds to the presence of a magnetic 
field, we can preserve, without any modification, all the results of the 
preceding chapters concerning the matrix representation of physical 
quantities ‘from the point of view’ of H, the transformation theory and 
the perturbation theory. 

If the magnetic (or, in general, the electromagnetic) field specified by
the vector $\mathbf G$ is relatively weak (compared with the field of force defined
by the potential energy $U$), then it can itself be treated as a
perturbation. Subtracting from the Hamiltonian (204b), which in future may
be denoted by $K$, the usual Hamiltonian

$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2+U,$$

which corresponds to the absence of the ‘perturbing’ forces specified
by $\mathbf G$, we get the following expression for the perturbation energy:

$$S = -\frac{h}{2\pi im_0}\mathbf G\cdot\nabla \tag{209}$$

or

$$S = -\frac{h}{2\pi im_0}\frac{e}{c}\mathbf A\cdot\nabla, \tag{209a}$$

where $\mathbf A$ is the vector potential corresponding to $\mathbf G$ ($= e\mathbf A/c$) and $e$ the
electric charge of the particle under consideration. Putting $\frac{h}{2\pi i}\nabla = \mathbf p$,
we can rewrite (209) in the form

$$S = -\frac{e}{m_0c}\mathbf A\cdot\mathbf p. \tag{209b}$$

The simplest application of this formula is provided by the special
case of the action of a permanent homogeneous magnetic field (Zeeman
effect). Denoting the field strength by $\mathfrak H$, we can, in this case, put

$$\mathbf A = \tfrac{1}{2}\,\mathfrak H\times\mathbf r, \tag{210}$$

where $\mathbf r$ is the radius vector of the particle. This gives in fact

$$\operatorname{curl}\mathbf A = \mathfrak H,$$

as can be verified most simply with the help of the coordinate
representation.
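
The verification in coordinates can also be carried out numerically. The sketch below is merely illustrative: the field components and the point at which the curl is evaluated are arbitrary assumed values, and the curl is approximated by central differences (which are exact, up to rounding, for a potential linear in the coordinates).

```python
def cross(a, b):
    """Vector product of two 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def vector_potential(field, r):
    """A = (1/2) H x r, as in formula (210)."""
    return tuple(0.5 * component for component in cross(field, r))

def curl(func, r, step=1e-3):
    """Central-difference curl of the vector function func at the point r."""
    def partial(i, j):
        # derivative of component i with respect to coordinate j
        plus = list(r)
        minus = list(r)
        plus[j] += step
        minus[j] -= step
        return (func(plus)[i] - func(minus)[i]) / (2.0 * step)
    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

field = (0.3, -1.2, 2.0)   # illustrative homogeneous field
point = (0.7, -0.4, 1.1)   # illustrative point of evaluation
result = curl(lambda r: vector_potential(field, r), point)
print(result)
```

The computed curl reproduces the assumed field components at any point, as it must for a homogeneous field.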
Substituting (210) in (209b), we get

$$S = -\frac{e}{2m_0c}(\mathfrak H\times\mathbf r)\cdot\mathbf p,$$

which can be rewritten in the form

$$S = -\frac{e}{2m_0c}\,\mathfrak H\cdot(\mathbf r\times\mathbf p).$$

Now the operator

$$\mathbf r\times\mathbf p = \mathbf M$$

obviously represents the angular momentum of the electron about the
central point (nucleus), from which the radius vector $\mathbf r$ is supposed to
be drawn. We thus get

$$S = -\frac{e}{2m_0c}\,\mathfrak H\cdot\mathbf M,$$

or

$$S = -\mathfrak H\cdot\boldsymbol\mu, \tag{210a}$$

where

$$\boldsymbol\mu = \frac{e}{2m_0c}\mathbf M \tag{210b}$$

can be defined as the operator representing the magnetic moment due
to the rotation of the electron about the (fixed) nucleus.

This definition follows from the fact that (210a) has exactly the same
form as the classical expression for the energy of a particle with a
(constant) magnetic moment $\boldsymbol\mu$ in a homogeneous magnetic field $\mathfrak H$.

If the unperturbed motion is a motion in a central field of force, so
that the vector $\mathbf M$ is constant, the vector $\boldsymbol\mu$ will also be a constant.
Its characteristic values are equal to those of $\mathbf M$ multiplied by $e/2m_0c$.
Taking the z-component of $\mathbf M$ and remembering that, with suitably
chosen characteristic functions $Y_{lm}(\theta,\phi) = P_{lm}e^{im\phi}$, the characteristic
values of $M_z$ are equal to integral multiples of $h/2\pi$, we get for the
characteristic values of $\mu_z$ integral multiples of the quantity

$$\mu_0 = \frac{eh}{4\pi m_0c},$$

which is called the Bohr magneton (since it is equal to the magnetic
moment of a one-quantum Bohr orbit).
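
The magnitude of the Bohr magneton follows at once from the formula above. The sketch below uses Gaussian (c.g.s.) units; the numerical constants are modern rounded values supplied here as assumptions, since the text gives none.

```python
import math

# Modern rounded values in Gaussian (c.g.s.) units; these numbers are
# assumptions supplied for illustration and are not taken from the text.
E_CHARGE = 4.80320e-10     # elementary charge, esu
PLANCK_H = 6.62607e-27     # Planck's constant h, erg s
M_ELECTRON = 9.10938e-28   # rest mass of the electron, g
C_LIGHT = 2.99792e10       # velocity of light, cm/s

def bohr_magneton():
    """e h / (4 pi m0 c), in erg per gauss."""
    return E_CHARGE * PLANCK_H / (4.0 * math.pi * M_ELECTRON * C_LIGHT)

print(bohr_magneton())
```

With these constants the result is close to $9.27\times10^{-21}$ erg per gauss.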

If the magnetic field is parallel to the z-axis, or rather if the latter
is chosen in the direction of the magnetic field, then the additional
energy of the perturbed states of motion compared with that
of the corresponding unperturbed states can easily be shown to be
equal to the product of $-\mathfrak H$ by the characteristic values of $\mu_z$. In fact
the non-diagonal matrix elements of the perturbation energy

$$S_{nlm,n'l'm'} = -\mathfrak H(\mu_z)_{nlm,n'l'm'}$$

with regard to the functions $\psi^0_{nlm}$ and $\psi^0_{n'l'm'}$ all vanish (which means
that the perturbation is of such a kind as to introduce no coupling
forces between the pendulums representing different states), so that
the additional values of the energy $\Delta H'$ reduce to the diagonal elements
of the perturbation matrix. We thus have, in the first approximation,

$$\Delta_1H' = \Delta_1H'_{nlm} = S_{nlm,nlm} = -\mathfrak H(\mu_z)_{nlm}$$

or

$$\Delta_1H' = -\frac{\mathfrak H\,eh}{4\pi m_0c}\,m. \tag{211}$$

This splitting up of the energy-levels by the magnetic field, or rather
the corresponding splitting of the spectral lines due to transitions
between energy-levels with different values of the axial quantum number
$m$, is called the ‘normal’ Zeeman effect. Since only such transitions
occur for which $\Delta m = 0$, $+1$, or $-1$, the normal Zeeman effect consists
in the splitting up of each line into three lines, one of which coincides
with the original line (corresponding to the absence of the magnetic
field), while the other two are displaced in opposite directions by the
amount

$$\Delta\nu = -\frac{e\mathfrak H}{4\pi m_0c}. \tag{211a}$$

The undisplaced line corresponds to harmonic oscillations of the electron
parallel to the magnetic field, while the displaced ones correspond to
circular motion in the one or the other sense about the direction of this
field. The relative intensities of these three lines for the case $\Delta l = +1$
and $\Delta l = -1$ have been determined in § 13, Chap. III.
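
The size of the normal Zeeman splitting can be estimated directly from (211a). The following sketch works with the magnitude of $\Delta\nu$ (taking $e$ as a positive number) in Gaussian units; the constants are modern rounded values and the field of $10^4$ gauss and the line frequency are illustrative assumptions, none of them given in the text.

```python
import math

# Modern rounded values in Gaussian (c.g.s.) units, assumed for illustration.
E_CHARGE = 4.80320e-10     # elementary charge (magnitude), esu
M_ELECTRON = 9.10938e-28   # rest mass of the electron, g
C_LIGHT = 2.99792e10       # velocity of light, cm/s

def zeeman_shift(field_gauss):
    """Magnitude of Delta nu = e H / (4 pi m0 c), in cycles per second."""
    return E_CHARGE * field_gauss / (4.0 * math.pi * M_ELECTRON * C_LIGHT)

def normal_triplet(nu0, field_gauss):
    """Frequencies of the three components for Delta m = -1, 0, +1."""
    shift = zeeman_shift(field_gauss)
    return [nu0 + dm * shift for dm in (-1, 0, 1)]

print(zeeman_shift(1.0e4))
print(normal_triplet(5.0e14, 1.0e4))  # illustrative optical line
```

For a field of $10^4$ gauss the displacement is of the order of $1.4\times10^{10}$ cycles per second, which is very small compared with optical frequencies.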

We shall not discuss the Zeeman effect in greater detail here, but 
shall postpone this question until a later section where it will be dealt 
with in connexion with the complications arising as a consequence of 
the hitherto ignored ‘intrinsic’ magnetic moment of the electron 
(‘anomalous’ Zeeman effect). 

Although the preceding results have been obtained to a first
approximation by the perturbation method, they can easily be shown to
hold exactly, so long as the action of the magnetic field is represented
by the (approximate) operator (209) or (210a).

We have, in fact, denoting by $\phi$ the azimuthal angle about the z-axis
(supposed to coincide with the direction of the magnetic field),

$$M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi},$$

and consequently

$$S = \frac{h\Delta\nu}{i}\frac{\partial}{\partial\phi}, \tag{211b}$$

where $\Delta\nu$ is given by (211a).

If we now compare the exact equation of the electron’s motion

$$(H+S-K')\chi_{K'} = 0$$

with the equation

$$(H-H')\psi^0_{H'} = 0,$$

corresponding to the absence of the magnetic field, we easily find that
they can be satisfied by the same functions

$$\chi_{K'} = \psi^0_{H'} = f_{nl}(r)P_{lm}(\theta)e^{im\phi},$$

if we put

$$K'-H' = \Delta H' = h\Delta\nu\,m$$

in accordance with (211). Thus, in the present case, we have

$$\Delta H' = \Delta_1H'.$$

We shall consider, in conclusion, another method of dealing with the 
3595.6 Ll 


258 WAVE MECHANICS OF A SINGLE ELECTRON § 26 
effect of a homogeneous magnetic field which is very instructive in that 
it brings to light the similarity between the wave-mechanical and the 
classical theory. 

We shall write the equation of the electron’s motion in the general 
form h a 


(148+ 3,2)x=0, (212) 


and shall introduce, instead of the original coordinate system 7,y,z, 
another system, z’, y’,z’( = z), rotating about the common (fixed) z-axis 
with a constant angular velocity w. The azimuthal angle ¢’ with respect 
to this rotating system is thus connected with ¢ by the formula 

¢’ = d—ot, (2128) 
whence it follows that 


(z),~ (a), ta a = (a), ee 


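Now the partial derivative with respect to $t$ in equation (212) obviously refers to a constant value of $\phi$. Taking account of (212 b), we can therefore rewrite this equation in the form

$$\left(H + S - \frac{h}{2\pi i}\,\omega\,\frac{\partial}{\partial\phi'} + \frac{h}{2\pi i}\frac{\partial'}{\partial t}\right)\chi = 0, \tag{213}$$

where $\partial'\chi/\partial t$ denotes the value of the partial derivative with respect to $t$, taken for a constant value of $\phi'$. This equation can obviously be regarded as describing the motion of the electron with respect to the rotating coordinate system.

Substituting in it the expression (211 b) for $S$, we get, since $\partial/\partial\phi = \partial/\partial\phi'$,

$$\left[H + \frac{h}{i}\left(\Delta\nu - \frac{\omega}{2\pi}\right)\frac{\partial}{\partial\phi'} + \frac{h}{2\pi i}\frac{\partial'}{\partial t}\right]\chi = 0. \tag{213 a}$$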


This equation reduces to that which describes the motion of the electron with respect to the fixed axes in the absence of the magnetic field (with the fixed axes replaced by the rotating ones) if the angular velocity $\omega$ is defined by

$$\omega = 2\pi\,\Delta\nu, \tag{213 b}$$

i.e. if the frequency of revolution is just equal to $\Delta\nu$.

This result is identical with that which is obtained with the help of classical mechanics, where it is interpreted as a precession of the electron's orbit about the direction of the magnetic field with the angular velocity $\omega = 2\pi\,\Delta\nu$ (Larmor's precession).


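The particular solutions of the equation

$$\left(H + \frac{h}{2\pi i}\frac{\partial'}{\partial t}\right)\chi = 0,$$

corresponding to a conservative motion of the electron with respect to the rotating axes, are obviously the same as those of the equation

$$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0,$$

with $\phi$ replaced by $\phi'$. We thus have

$$\chi = \chi_{K'} = f_{nl}(r)\,P_{lm}(\theta)\,e^{im\phi'}\,e^{-i2\pi H't/h},$$

where $H'$ is a characteristic value of $H$, i.e.

$$\chi_{K'} = f_{nl}(r)\,P_{lm}(\theta)\,e^{im\phi}\,e^{-i2\pi(H'+h\Delta\nu\,m)t/h} = \psi^{0}_{H'}\,e^{-i2\pi(H'+h\Delta\nu\,m)t/h}.$$

This is another expression of the result $\chi_{K'} = \psi^{0}_{H'}$, $K'-H' = h\,\Delta\nu\,m$ found by the preceding method.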
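
The equivalence of the two descriptions can be checked numerically. The sketch below is only an illustration: the numerical values of h, Δν and H' are arbitrary assumptions, and the function names are hypothetical. It compares the angular and time factor written with the rotating angle φ' = φ − ωt (unshifted energy H') against the same factor written with the fixed angle φ and the shifted energy K' = H' + hΔν·m.

```python
import cmath
import math

H_PLANCK = 1.0                   # Planck's constant h (arbitrary units, assumed)
DELTA_NU = 0.25                  # Larmor frequency Delta-nu (assumed value)
OMEGA = 2 * math.pi * DELTA_NU   # angular velocity of the rotating axes
H_PRIME = 3.0                    # unperturbed energy H' (assumed value)


def chi_rotating(m, phi, t):
    """Factor exp(i m phi') exp(-i 2 pi H' t / h) with phi' = phi - omega t."""
    phi_rot = phi - OMEGA * t
    return cmath.exp(1j * m * phi_rot) * cmath.exp(-2j * math.pi * H_PRIME * t / H_PLANCK)


def chi_fixed(m, phi, t):
    """Factor exp(i m phi) exp(-i 2 pi K' t / h) with K' = H' + h Delta-nu m."""
    k_prime = H_PRIME + H_PLANCK * DELTA_NU * m
    return cmath.exp(1j * m * phi) * cmath.exp(-2j * math.pi * k_prime * t / H_PLANCK)


def max_mismatch():
    """Largest difference between the two forms over a small grid of m, phi, t."""
    worst = 0.0
    for m in range(-3, 4):
        for phi in (0.0, 0.7, 2.1):
            for t in (0.0, 0.4, 1.3):
                worst = max(worst, abs(chi_rotating(m, phi, t) - chi_fixed(m, phi, t)))
    return worst
```

The two forms agree to rounding error only when ω = 2πΔν; any other choice of `OMEGA` leaves a residual phase proportional to m.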


27. Relativistic Wave Mechanics as a Formal Generalization of 
Maxwell’s Electromagnetic Theory of Light 


Coming back to the relativistic theory of the motion of an electron in an external electromagnetic field, we have to face the following situation. If the relativistic equation (196) established in § 25 is assumed to be correct, we must give up the theory of the preceding chapters, so far as the introduction of the energy operator $H$ is concerned. If, on the other hand, we wish to preserve this theory and express the wave-mechanical law of motion by an equation of the type

$$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0,$$

we must replace the relativistic equation (196) by an equation or system of equations which are linear and not quadratic with respect to the operator

$$p_t = \frac{h}{2\pi i}\frac{\partial}{\partial t}.$$

We shall now try the second alternative, not only because it fits in better with our previous ideas, but also because it is more general than the first alternative. In fact, the order of a differential equation can always be increased by repeated differentiation, so that, in particular, from an equation of the type $(H+p_t)\psi = 0$ we can always pass to an equation containing the square of $p_t$. This can be done, for instance, by applying to the preceding equation the operator $H+p_t$ or $H-p_t$, giving $(H^2+2Hp_t+p_t^2)\psi = 0$ in the first case and $(H^2-p_t^2)\psi = 0$ in the second.

Of course we must be prepared to find that the equation of the second order (with regard to $p_t$), obtained in this way, will be somewhat different from our original equation (196). Which one is chosen will ultimately be decided by comparing theory with experiment.

It can easily be shown that a single equation of the first order with one unknown function $\psi$, satisfying the space-time symmetry requirements of the relativity theory and giving by repeated differentiation anything like equation (196), is utterly impossible. It is, however, possible to replace equation (196) by a system of several equations of the first order with as many unknown functions, which would satisfy the space-time symmetry condition and with the help of a second differentiation would assume a form similar to and, in the special case of free motion, identical with equation (196). We shall see, moreover, that this system of equations can be written in the form of a single equation of the type $(H+p_t)\psi = 0$, where $H$, $p_t$, and $\psi$ are treated as four-dimensional matrices, or similarly, in one of the following three equivalent forms $(P_x-p_x)\psi = 0$, $(P_y-p_y)\psi = 0$, $(P_z-p_z)\psi = 0$, where
$P_x$, $P_y$, $P_z$ are matrix operators representing the components of the
electron’s momentum in the same sense as H = F represents its energy. 
The possibility of writing the equation of motion in these four equivalent 
forms is the direct expression of the equivalence of the space coordinates 
and the time, which forms the essence of the relativity theory. 

The first part of our problem, namely, the establishment of a system 
of first-order equations satisfying the space-time symmetry condition, 
can be solved in a very simple way, with the help of the analogy 
between mechanics and optics, which was the starting-point for the 
development of wave mechanics and which can still be used—with 
certain reservations—as a source of inspiration. 

Equation (201 a)

$$(u_x^2+u_y^2+u_z^2-u_t^2+m_0^2c^2)\psi = 0$$

in the case of a particle with vanishing charge and rest-mass, reduces to

$$\left(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\psi = 0, \tag{214}$$

i.e. to the equation of the propagation of light-waves (in empty space) with the true velocity $c$. If the wave velocity is equal to $c$, then the velocity of the associated particles must also be equal to $c$, so that these particles can be identified with photons.

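Now, according to the electromagnetic theory of light, equation (214), usually denoted as d'Alembert's equation, does not give a complete description of the electromagnetic field of the light waves. This field is specified by six quantities, namely, the three components $E_x, E_y, E_z$ of the electric field and the three components† $H_x, H_y, H_z$ of the magnetic field, these quantities satisfying the well-known equations of Maxwell:

$$\left.\begin{aligned}
\frac{\partial H_z}{\partial y}-\frac{\partial H_y}{\partial z}-\frac{1}{c}\frac{\partial E_x}{\partial t} &= 0\\
\frac{\partial H_x}{\partial z}-\frac{\partial H_z}{\partial x}-\frac{1}{c}\frac{\partial E_y}{\partial t} &= 0\\
\frac{\partial H_y}{\partial x}-\frac{\partial H_x}{\partial y}-\frac{1}{c}\frac{\partial E_z}{\partial t} &= 0
\end{aligned}\right\}\quad\left(\operatorname{curl}\mathbf{H}-\frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} = 0\right), \tag{215}$$

and

$$\left.\begin{aligned}
\frac{\partial E_z}{\partial y}-\frac{\partial E_y}{\partial z}+\frac{1}{c}\frac{\partial H_x}{\partial t} &= 0\\
\frac{\partial E_x}{\partial z}-\frac{\partial E_z}{\partial x}+\frac{1}{c}\frac{\partial H_y}{\partial t} &= 0\\
\frac{\partial E_y}{\partial x}-\frac{\partial E_x}{\partial y}+\frac{1}{c}\frac{\partial H_z}{\partial t} &= 0
\end{aligned}\right\}\quad\left(\operatorname{curl}\mathbf{E}+\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t} = 0\right). \tag{215 a}$$

To these six equations we may add the following two:

$$\operatorname{div}\mathbf{E} = \frac{\partial E_x}{\partial x}+\frac{\partial E_y}{\partial y}+\frac{\partial E_z}{\partial z} = 0, \tag{216}$$

$$\operatorname{div}\mathbf{H} = \frac{\partial H_x}{\partial x}+\frac{\partial H_y}{\partial y}+\frac{\partial H_z}{\partial z} = 0. \tag{216 a}$$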


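The latter equations can, at least for vibrational processes, be regarded as a consequence of (215) and (215 a) respectively. Thus, if we differentiate equations (215) with regard to $x$, $y$, $z$, and add them, we get $\frac{\partial}{\partial t}\operatorname{div}\mathbf{E} = 0$. From this it follows, in so far as we reject purely static fields, that $\operatorname{div}\mathbf{E} = 0$. In the same way we can derive (216 a) from (215 a).

If we differentiate the left side of the first equation (215) with respect to the time $t$ (dividing by $c$), we obtain, using (215 a),

$$\begin{aligned}
\frac{\partial}{\partial y}\frac{1}{c}\frac{\partial H_z}{\partial t}-\frac{\partial}{\partial z}\frac{1}{c}\frac{\partial H_y}{\partial t}-\frac{1}{c^2}\frac{\partial^2E_x}{\partial t^2}
&= -\frac{\partial}{\partial y}\left(\frac{\partial E_y}{\partial x}-\frac{\partial E_x}{\partial y}\right)+\frac{\partial}{\partial z}\left(\frac{\partial E_x}{\partial z}-\frac{\partial E_z}{\partial x}\right)-\frac{1}{c^2}\frac{\partial^2E_x}{\partial t^2}\\
&= \frac{\partial^2E_x}{\partial y^2}+\frac{\partial^2E_x}{\partial z^2}-\frac{\partial}{\partial x}\left(\frac{\partial E_y}{\partial y}+\frac{\partial E_z}{\partial z}\right)-\frac{1}{c^2}\frac{\partial^2E_x}{\partial t^2},
\end{aligned}$$

i.e., by (216),

$$\frac{\partial^2E_x}{\partial x^2}+\frac{\partial^2E_x}{\partial y^2}+\frac{\partial^2E_x}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2E_x}{\partial t^2} = 0,$$

which is merely equation (214) with $\psi = E_x$. In the same way we obtain similar equations for the other five components of the electromagnetic field. We see, therefore, that d'Alembert's equation must be regarded as the result of the elimination, with the help of a second differentiation, of the different field-components from Maxwell's equations.

† The reader will easily distinguish between the symbol $H$ in the combinations $H_x$, $H_y$, $H_z$ used here for the components of $\mathbf{H}$ and the simple $H$ used passim for the Hamiltonian energy.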

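This elimination is usually carried out with the help of the potentials $A_x$, $A_y$, $A_z$, $\phi$ which are introduced by means of the formulae

$$\mathbf{E} = -\nabla\phi-\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t},\qquad \mathbf{H} = \operatorname{curl}\mathbf{A}.$$

Thereby equations (215 a) and (216 a) turn themselves into identities, while equations (215) and (216), together with the additional condition

$$\frac{\partial A_x}{\partial x}+\frac{\partial A_y}{\partial y}+\frac{\partial A_z}{\partial z}+\frac{1}{c}\frac{\partial\phi}{\partial t} = 0,$$

yield four d'Alembert equations of the type (214) for the components of the potential.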
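The preceding relation leads to a simplification of the wave equation (196), which assumes the following form:

$$\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}-\frac{4\pi ie}{hc}\left(A_x\frac{\partial\psi}{\partial x}+A_y\frac{\partial\psi}{\partial y}+A_z\frac{\partial\psi}{\partial z}+\frac{\phi}{c}\frac{\partial\psi}{\partial t}\right)-\frac{4\pi^2e^2}{h^2c^2}\left(A_x^2+A_y^2+A_z^2-\phi^2+\frac{m_0^2c^4}{e^2}\right)\psi = 0 \tag{217}$$

or, in vector notation,

$$\nabla^2\psi-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}-\frac{4\pi ie}{hc}\left(\mathbf{A}\cdot\operatorname{grad}\psi+\frac{\phi}{c}\frac{\partial\psi}{\partial t}\right)-\frac{4\pi^2e^2}{h^2c^2}\left(A^2-\phi^2+\frac{m_0^2c^4}{e^2}\right)\psi = 0. \tag{217 a}$$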
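This equation, written in the form (201 a), can be regarded as the simplest generalization of d'Alembert's equation (214) for material particles (electrons) with a non-vanishing charge $e$ and rest-mass $m_0$: a generalization obtained by replacing the operators

$$\frac{h}{2\pi i}\frac{\partial}{\partial x},\quad \frac{h}{2\pi i}\frac{\partial}{\partial y},\quad \frac{h}{2\pi i}\frac{\partial}{\partial z},\quad \frac{h}{2\pi i}\frac{1}{c}\frac{\partial}{\partial t}$$

by the operators $u_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}-\frac{e}{c}A_x$, etc., and further by adding to the left side of (214) the term $m_0^2c^2\psi$.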


Now Maxwell’s equations form a system of equations of the first 
order satisfying the space-time symmetry condition and implying 
d’Alembert’s equation as a corollary. We are thus naturally led to 
the conclusion that the first-order equations of the relativistic wave 
mechanics, which must replace the second-order equation (201 a), can 
be obtained as a generalization of Maxwell’s equations, in a way similar 
to that which leads from d'Alembert's equation (214) to the wave-mechanical equation (201 a).

We shall assume, therefore, that the electron (or proton) waves can be described not, as so far assumed, by a scalar quantity $\psi$ but by two vector quantities $\mathbf{M}$ and $\mathbf{N}$ which are analogous to the magnetic and electric field strength ($\mathbf{H}$ and $\mathbf{E}$) respectively, and we shall seek to generalize Maxwell's equations by introducing the operators $u_x$, etc., instead of $\frac{h}{2\pi i}\frac{\partial}{\partial x}$, etc. The second part of this generalization, i.e. the introduction of the rest-mass, we shall at first disregard, i.e. we shall put $m_0 = 0$.

To begin with, we must notice that the generalized operators $u_x, \ldots$, unlike the original, are non-commutative, i.e. we obtain different results if we apply to any function $\psi$ two such operators in a different order. For example, if we form the difference of the expressions $u_xu_y\psi$ and $u_yu_x\psi$ we obtain

$$(u_xu_y-u_yu_x)\psi = -\frac{he}{2\pi ic}\left[\frac{\partial}{\partial x}(A_y\psi)-A_y\frac{\partial\psi}{\partial x}-\frac{\partial}{\partial y}(A_x\psi)+A_x\frac{\partial\psi}{\partial y}\right] = -\frac{he}{2\pi ic}\left(\frac{\partial A_y}{\partial x}-\frac{\partial A_x}{\partial y}\right)\psi,$$

or, by (199 b),

$$u_xu_y-u_yu_x = -\frac{he}{2\pi ic}H_z, \tag{218}$$

if we omit the factor $\psi$ operated upon. In a similar way we get the formulae

$$u_yu_z-u_zu_y = -\frac{he}{2\pi ic}H_x,\qquad u_zu_x-u_xu_z = -\frac{he}{2\pi ic}H_y,$$

and also

$$u_xu_t-u_tu_x = -\frac{he}{2\pi ic}E_x, \tag{218 a}$$

and two analogous formulae for the combinations $(y, t)$ and $(z, t)$.
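
The first of these commutation relations can be tried out numerically. The following is a minimal sketch under stated assumptions: units are chosen so that h/2π = 1 and e/c = 1, the vector potential is that of a uniform field H_z = B, derivatives are approximated by central differences, and the trial function `psi` and all names are hypothetical illustrations.

```python
import cmath

HBAR = 1.0       # h / (2 pi), set to 1 for illustration (assumption)
E_OVER_C = 1.0   # e / c, set to 1 for illustration (assumption)
B = 0.7          # uniform magnetic field H_z (assumed value)
STEP = 1e-3      # finite-difference step


def a_x(x, y):
    """x-component of a vector potential with curl equal to B along z."""
    return -0.5 * B * y


def a_y(x, y):
    """y-component of the same vector potential."""
    return 0.5 * B * x


def d_dx(f):
    """Central-difference approximation of the partial derivative in x."""
    return lambda x, y: (f(x + STEP, y) - f(x - STEP, y)) / (2 * STEP)


def d_dy(f):
    """Central-difference approximation of the partial derivative in y."""
    return lambda x, y: (f(x, y + STEP) - f(x, y - STEP)) / (2 * STEP)


def u_x(f):
    """u_x = (h / 2 pi i) d/dx - (e/c) A_x acting on f."""
    df = d_dx(f)
    return lambda x, y: (HBAR / 1j) * df(x, y) - E_OVER_C * a_x(x, y) * f(x, y)


def u_y(f):
    """u_y = (h / 2 pi i) d/dy - (e/c) A_y acting on f."""
    df = d_dy(f)
    return lambda x, y: (HBAR / 1j) * df(x, y) - E_OVER_C * a_y(x, y) * f(x, y)


def psi(x, y):
    """Arbitrary smooth trial function (hypothetical)."""
    return cmath.exp(1j * (0.3 * x - 0.5 * y)) * (1.0 + 0.1 * x * y)


def commutator_at(x, y):
    """(u_x u_y - u_y u_x) psi evaluated at the point (x, y)."""
    return u_x(u_y(psi))(x, y) - u_y(u_x(psi))(x, y)


def expected_at(x, y):
    """-(h e / 2 pi i c) H_z psi, the right side of (218) applied to psi."""
    return -(HBAR * E_OVER_C / 1j) * B * psi(x, y)
```

The agreement is limited by the truncation error of the central differences (of order `STEP` squared), not by the relation itself.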
Because the operators $u$ are not commutative, their introduction into the eight Maxwell equations [multiplied by $h/(2\pi i)$] in place of the operators $\frac{h}{2\pi i}\frac{\partial}{\partial x}$, etc., necessitates a further modification. We must, namely, add to the right side of these equations extra terms of the form $uM_0$ or $uN_0$, where $M_0$ and $N_0$ are two new scalars; otherwise (i.e. when $M_0 = N_0 = 0$) the eight equations obtained for the six quantities $M_x$, $M_y$, $M_z$, $N_x$, $N_y$, $N_z$ would be, in general, incompatible with one another. In fact, if we limited ourselves to a replacement of the operators $\frac{h}{2\pi i}\frac{\partial}{\partial x}, \ldots$ by $u_x, \ldots$, the equations obtained from (216) and (216 a) would no longer be a corollary of the equations obtained from (215) and (215 a) and would therefore contradict the latter.

In writing down the generalized 'Maxwell-like' equations, the following circumstances should be noticed:

(1) The extra terms $uM_0$ and $uN_0$ on the right side must represent the space-time components of two four-dimensional vectors analogous respectively to the vector of electric current and charge density in the case of equations (215) and (216), which will be referred to as the I group of Maxwell's equations, and to the vector of 'magnetic current and charge density' in the case of the II group, formed by equations (215 a) and (216 a).

Treating $M_0$ and $N_0$ as scalar quantities, we can define the components of the first vector by $u_xM_0$, $u_yM_0$, $u_zM_0$, $\pm u_tM_0$, and that of the second by $u_xN_0$, $u_yN_0$, $u_zN_0$, $\pm u_tN_0$.

(2) The ambiguity of sign ($\pm$) arising in this connexion can be removed with the help of the fact that the two groups of Maxwell's equations can be derived from each other if $\mathbf{E}$ is replaced by $\mathbf{H}$ and $\mathbf{H}$ by $-\mathbf{E}$.

We must therefore require that one of the two groups of the generalized Maxwell-like equations be obtained from the other by replacing $N_x$, $N_y$, $N_z$, $N_0$ by $M_x$, $M_y$, $M_z$, $M_0$ and $M_x$, $M_y$, $M_z$, $M_0$ by $-N_x$, $-N_y$, $-N_z$, $-N_0$. Taking this into consideration, we obtain, as the first step in our generalization of Maxwell's equations, leaving the rest-mass out of account, the following system of equations:


$$\left.\begin{aligned}
u_yM_z-u_zM_y-u_tN_x &= u_xM_0\\
u_zM_x-u_xM_z-u_tN_y &= u_yM_0\\
u_xM_y-u_yM_x-u_tN_z &= u_zM_0
\end{aligned}\right\} \tag{219}$$

$$\left.\begin{aligned}
u_yN_z-u_zN_y+u_tM_x &= u_xN_0\\
u_zN_x-u_xN_z+u_tM_y &= u_yN_0\\
u_xN_y-u_yN_x+u_tM_z &= u_zN_0
\end{aligned}\right\} \tag{219 a}$$

$$u_xN_x+u_yN_y+u_zN_z = -u_tM_0 \tag{220}$$

$$u_xM_x+u_yM_y+u_zM_z = +u_tN_0. \tag{220 a}$$

From these equations we will now by 'generalized differentiation', i.e. by repeated application of the operators $u$, obtain eight differential equations of the second order which correspond to d'Alembert's differential equation.

If we apply the operators $u_x$, $u_y$, $u_z$ to the equations (219) and the operator $u_t$ to equation (220), we obtain by addition, using (218) and (218 a):

$$-\frac{he}{2\pi ic}\left[H_xM_x+H_yM_y+H_zM_z-E_xN_x-E_yN_y-E_zN_z\right] = (u_x^2+u_y^2+u_z^2-u_t^2)M_0,$$

or, if we put for shortness

$$D_0 = u_x^2+u_y^2+u_z^2-u_t^2 \tag{221}$$

and use the vector notation:

$$D_0M_0 = \frac{he}{2\pi ic}(-\mathbf{H}\cdot\mathbf{M}+\mathbf{E}\cdot\mathbf{N}). \tag{221 a}$$

Similarly we get from (219 a) and (220 a) the equation

$$D_0N_0 = \frac{he}{2\pi ic}(-\mathbf{H}\cdot\mathbf{N}-\mathbf{E}\cdot\mathbf{M}). \tag{221 b}$$

With $e = 0$ these equations can be satisfied identically if we put $M_0 = N_0 = 0$. In the general case, however, the scalar functions $M_0$ and $N_0$ must be different from zero.
If we apply the operator $u_t$ to the first equation (219) and interchange the order of the different operators $u$, we get, taking account of (218 a),

$$\frac{he}{2\pi ic}(E_yM_z-E_zM_y-E_xM_0)+u_yu_tM_z-u_zu_tM_y-u_xu_tM_0-u_t^2N_x = 0.$$

Now by (219 a) and (220):

$$\begin{aligned}
u_yu_tM_z &= u_yu_zN_0-u_yu_xN_y+u_y^2N_x,\\
-u_zu_tM_y &= -u_zu_yN_0+u_z^2N_x-u_zu_xN_z,\\
-u_xu_tM_0 &= u_x^2N_x+u_xu_yN_y+u_xu_zN_z.
\end{aligned}$$

By repeated application of the relations (218) and (218 a), we thus obtain

$$(u_x^2+u_y^2+u_z^2-u_t^2)N_x+\frac{he}{2\pi ic}(E_yM_z-E_zM_y-E_xM_0+H_yN_z-H_zN_y-H_xN_0) = 0.$$


This equation and the two others which result from it by cyclic interchange of the indices $x$, $y$, $z$ can be summarized in the following vector equation:

$$D_0\mathbf{N}+\frac{he}{2\pi ic}\left[(\mathbf{E}\times\mathbf{M}-\mathbf{E}M_0)+(\mathbf{H}\times\mathbf{N}-\mathbf{H}N_0)\right] = 0. \tag{222}$$

Similarly, by application of the same method to equations (219 a), we obtain the second vector equation,

$$D_0\mathbf{M}+\frac{he}{2\pi ic}\left[(\mathbf{H}\times\mathbf{M}-\mathbf{H}M_0)-(\mathbf{E}\times\mathbf{N}-\mathbf{E}N_0)\right] = 0. \tag{222 a}$$

Equations (221 a), (221 b), (222), and (222 a) are the required generalization of d'Alembert's equation. They differ, however, from the latter, not only by the differential operators $u$ appearing in $D_0$, instead of $\frac{h}{2\pi i}\frac{\partial}{\partial x}$, etc., but also by additional terms which are proportional to the electromagnetic field components and which for each equation have a special form.

If we omit these additional terms (whose physical meaning will be explained later) we obtain, for all the eight functions $M_x, \ldots, N_z, M_0, N_0$, identical equations of the d'Alembert type: equations which differ from the relativity wave equation (201 a) or (217 a) found earlier only by the absence of the 'mass term' $m_0^2c^2$ in the operator $D_0$. This shows that the second step of our generalization of Maxwell's equations, in so far as it is a question of the resulting generalized d'Alembert equations, must consist in replacing the operator $D_0$ by the operator introduced earlier, namely,

$$D = D_0+m_0^2c^2. \tag{223}$$

The corresponding introduction of the parameter $m_0c$ into the equations of the first order (219) to (220 a) is done most simply as follows: In equations (219) and (220 a), which contain the time derivatives of the quantities $N_x$, $N_y$, $N_z$, $N_0$, we replace the operator $u_t$ by

$$u_t' = u_t-m_0c, \tag{223 a}$$

and in equations (219 a) and (220) by

$$u_t'' = u_t+m_0c. \tag{223 b}$$
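Taking into account the relation $u_t'u_t'' = u_t''u_t' = u_t^2-m_0^2c^2$, we can easily convince ourselves that from these generalized Maxwell's equations

$$\left.\begin{aligned}
u_yM_z-u_zM_y-u_t'N_x &= u_xM_0\\
u_zM_x-u_xM_z-u_t'N_y &= u_yM_0\\
u_xM_y-u_yM_x-u_t'N_z &= u_zM_0
\end{aligned}\right\} \tag{224}$$

$$\left.\begin{aligned}
u_yN_z-u_zN_y+u_t''M_x &= u_xN_0\\
u_zN_x-u_xN_z+u_t''M_y &= u_yN_0\\
u_xN_y-u_yN_x+u_t''M_z &= u_zN_0
\end{aligned}\right\} \tag{224 a}$$

$$u_xN_x+u_yN_y+u_zN_z = -u_t''M_0 \tag{225}$$

$$u_xM_x+u_yM_y+u_zM_z = +u_t'N_0, \tag{225 a}$$

there follow the generalized d'Alembert's equations:

$$\left.\begin{aligned}
DM_0+\frac{he}{2\pi ic}(\mathbf{H}\cdot\mathbf{M}-\mathbf{E}\cdot\mathbf{N}) &= 0\\
DN_0+\frac{he}{2\pi ic}(\mathbf{H}\cdot\mathbf{N}+\mathbf{E}\cdot\mathbf{M}) &= 0
\end{aligned}\right\} \tag{226}$$

$$\left.\begin{aligned}
D\mathbf{M}+\frac{he}{2\pi ic}\left[(\mathbf{H}\times\mathbf{M}-\mathbf{H}M_0)-(\mathbf{E}\times\mathbf{N}-\mathbf{E}N_0)\right] &= 0\\
D\mathbf{N}+\frac{he}{2\pi ic}\left[(\mathbf{E}\times\mathbf{M}-\mathbf{E}M_0)+(\mathbf{H}\times\mathbf{N}-\mathbf{H}N_0)\right] &= 0
\end{aligned}\right\} \tag{226 a}$$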
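Equations (226) and (226 a) become identical if we put either

$$\mathbf{N} = i\mathbf{M},\qquad N_0 = iM_0 \tag{227}$$

or

$$\mathbf{N} = -i\mathbf{M},\qquad N_0 = -iM_0. \tag{227 a}$$

Thereby they assume the following simple form:

$$DM_0+\frac{he}{2\pi ic}(\mathbf{H}\mp i\mathbf{E})\cdot\mathbf{M} = 0, \tag{228}$$

$$D\mathbf{M}+\frac{he}{2\pi ic}\left[(\mathbf{H}\mp i\mathbf{E})\times\mathbf{M}-(\mathbf{H}\mp i\mathbf{E})M_0\right] = 0. \tag{228 a}$$

Let that solution of these equations which corresponds to the upper sign be denoted by $\mathbf{M}^+$ and the other by $\mathbf{M}^-$. The general solution of equations (226) and (226 a), therefore, can obviously be written in the form

$$\left.\begin{aligned}
\mathbf{M} &= c_1\mathbf{M}^++c_2\mathbf{M}^-, & M_0 &= c_1M_0^++c_2M_0^-\\
\mathbf{N} &= i(c_1\mathbf{M}^+-c_2\mathbf{M}^-), & N_0 &= i(c_1M_0^+-c_2M_0^-)
\end{aligned}\right\} \tag{228 b}$$

where $c_1$ and $c_2$ are two arbitrary constants (which must be introduced if the solutions $\mathbf{M}^+$ and $\mathbf{M}^-$ are normalized in some way).

It must be mentioned, however, that the first-order equations (224)-(225 a) do not admit solutions of the type (227) and (227 a), because of the appearance of the two different operators $u_t'$ and $u_t''$. These solutions do not have, therefore, any real significance.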
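
That the substitution N = iM, N_0 = iM_0 makes the two pairs of second-order equations coincide is a purely algebraic statement about the field-dependent terms, and it can be checked with random numbers. The sketch below is an illustration only; the helper names are hypothetical, and the common factor he/(2πic) is omitted. With N = iM the terms of the N-equations should equal i times the terms of the M-equations, which is exactly what is needed since DN = iDM.

```python
import random


def dot(a, b):
    """Bilinear (non-conjugating) scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scalar_terms(H, E, M, N):
    """Field terms of the equations for M_0 and N_0: (H.M - E.N, H.N + E.M)."""
    return dot(H, M) - dot(E, N), dot(H, N) + dot(E, M)


def vector_terms(H, E, M, N, M0, N0):
    """Field terms of the vector equations for M and N."""
    hm, en = cross(H, M), cross(E, N)
    em, hn = cross(E, M), cross(H, N)
    first = tuple(hm[k] - H[k] * M0 - (en[k] - E[k] * N0) for k in range(3))
    second = tuple(em[k] - E[k] * M0 + (hn[k] - H[k] * N0) for k in range(3))
    return first, second


def max_deviation(seed=0):
    """Largest violation of 'N-terms = i * M-terms' for random fields with N = iM."""
    rng = random.Random(seed)
    H = tuple(rng.uniform(-1, 1) for _ in range(3))
    E = tuple(rng.uniform(-1, 1) for _ in range(3))
    M = tuple(complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3))
    M0 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    N = tuple(1j * m for m in M)
    N0 = 1j * M0
    s_first, s_second = scalar_terms(H, E, M, N)
    v_first, v_second = vector_terms(H, E, M, N, M0, N0)
    worst = abs(s_second - 1j * s_first)
    for k in range(3):
        worst = max(worst, abs(v_second[k] - 1j * v_first[k]))
    return worst
```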



28. Alternative Form of the Wave Equations; Duplicity and Quadruplicity Phenomenon
There is another possibility of halving the number both of the second-order equations (226)-(226 a) and of the first-order equations (224)-(225 a), as well as of the wave functions $\mathbf{M}$, $\mathbf{N}$, defined by them.

We must notice, first, that equations (224)-(225 a) can be naturally regrouped by associating (225 a) not with (224 a), as has been done before, but with (224), and (225) with (224 a). The two groups of four equations thus formed will be denoted by I' and II' respectively.

It is now easily seen that the equations of each group can be compounded in pairs and, as it were, folded up together, in such a way as to form two groups of two equations involving four unknown wave functions. Taking the group I' we can, for example, compound the first two equations (224) to form one pair and the third with equation (225 a) to form the second pair. If we multiply the first equation of the second pair by $i$ and add it to the other, we get

$$\begin{aligned}
&(u_x-iu_y)M_x+(iu_x+u_y)M_y+u_z(M_z-iM_0)+u_t'(-iN_z-N_0)\\
&\quad = (u_x-iu_y)(M_x+iM_y)+u_z(M_z-iM_0)+u_t'(-iN_z-N_0) = 0.
\end{aligned}$$

Likewise we obtain, by subtracting the second equation (224) from the first equation multiplied by $i$,

$$\begin{aligned}
&(u_x+iu_y)M_z+u_z(-M_x-iM_y)+u_t'(-iN_x+N_y)-(iu_x-u_y)M_0\\
&\quad = (u_x+iu_y)(M_z-iM_0)-u_z(M_x+iM_y)+u_t'(-iN_x+N_y) = 0.
\end{aligned}$$

If we put, therefore,

$$\left.\begin{aligned}
\psi_1 &= M_x+iM_y, & \psi_2 &= M_z-iM_0\\
\psi_3 &= -i(N_x+iN_y), & \psi_4 &= -i(N_z-iN_0)
\end{aligned}\right\} \tag{229}$$

we can reduce the four equations under consideration to the following two:

$$\left.\begin{aligned}
(u_x-iu_y)\psi_1+u_z\psi_2+u_t'\psi_4 &= 0\\
(u_x+iu_y)\psi_2-u_z\psi_1+u_t'\psi_3 &= 0
\end{aligned}\right\} \tag{229 a}$$

In a similar way the four equations of the group II', (224 a) and (225), can be folded up into the two equations

$$\left.\begin{aligned}
(u_x-iu_y)\psi_3+u_z\psi_4+u_t''\psi_2 &= 0\\
(u_x+iu_y)\psi_4-u_z\psi_3+u_t''\psi_1 &= 0
\end{aligned}\right\} \tag{229 b}$$

with the same four unknown wave functions (229). The equations (229 a, b) were first derived by Dirac.
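
The folding of the first group of four equations into two can be checked term by term. Since the combination is linear in the wave functions and every operator acts from the left on a single function, it is enough for this purpose to replace the operators by arbitrary complex numbers (as they would be for a plane wave in the absence of a field). This replacement, and all names below, are assumptions of the sketch, not part of the original argument.

```python
import random


def residuals_group_one(u, ut1, M, M0, N, N0):
    """Left minus right sides of the three equations containing u_t' N_x, N_y, N_z
    and of the equation u.M = u_t' N_0 (group I')."""
    ux, uy, uz = u
    mx, my, mz = M
    nx, ny, nz = N
    r1 = uy * mz - uz * my - ut1 * nx - ux * M0
    r2 = uz * mx - ux * mz - ut1 * ny - uy * M0
    r3 = ux * my - uy * mx - ut1 * nz - uz * M0
    r4 = ux * mx + uy * my + uz * mz - ut1 * N0
    return r1, r2, r3, r4


def folded_group_one(u, ut1, M, M0, N, N0):
    """Left sides of the two folded equations in terms of psi_1 ... psi_4."""
    ux, uy, uz = u
    psi1 = M[0] + 1j * M[1]
    psi2 = M[2] - 1j * M0
    psi3 = -1j * (N[0] + 1j * N[1])
    psi4 = -1j * (N[2] - 1j * N0)
    first = (ux - 1j * uy) * psi1 + uz * psi2 + ut1 * psi4
    second = (ux + 1j * uy) * psi2 - uz * psi1 + ut1 * psi3
    return first, second


def folding_error(seed=0):
    """Difference between i*r3 + r4 and i*r1 - r2 and the two folded expressions."""
    rng = random.Random(seed)

    def rnd():
        return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))

    u = (rnd(), rnd(), rnd())
    ut1 = rnd()
    M = (rnd(), rnd(), rnd())
    N = (rnd(), rnd(), rnd())
    M0, N0 = rnd(), rnd()
    r1, r2, r3, r4 = residuals_group_one(u, ut1, M, M0, N, N0)
    first, second = folded_group_one(u, ut1, M, M0, N, N0)
    return max(abs(1j * r3 + r4 - first), abs(1j * r1 - r2 - second))
```

The functions M and N are left unconstrained here, so the check confirms the identity of the combinations themselves rather than any particular solution.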
The process just described can be applied to the second-order equations which are obtained from (226) and (226 a) by taking their components along the coordinate axes. We have, for instance, according to the first equation (226 a),

$$D(M_x+iM_y)+\frac{he}{2\pi ic}\Big\{\left[(H_yM_z-H_zM_y-H_xM_0)+i(H_zM_x-H_xM_z-H_yM_0)\right]-\left[(E_yN_z-E_zN_y-E_xN_0)+i(E_zN_x-E_xN_z-E_yN_0)\right]\Big\} = 0,$$

that is,

$$D(M_x+iM_y)+\frac{he}{2\pi ic}\Big\{\left[iH_z(M_x+iM_y)-i(H_x+iH_y)(M_z-iM_0)\right]-\left[iE_z(N_x+iN_y)-i(E_x+iE_y)(N_z-iN_0)\right]\Big\} = 0,$$

and similarly,

$$D(M_z-iM_0)+\frac{he}{2\pi ic}\Big\{\left[(H_xM_y-H_yM_x-H_zM_0)-i(H_xM_x+H_yM_y+H_zM_z)\right]-\left[(E_xN_y-E_yN_x-E_zN_0)-i(E_xN_x+E_yN_y+E_zN_z)\right]\Big\} = 0,$$

that is,

$$D(M_z-iM_0)+\frac{he}{2\pi ic}\Big\{\left[-i(H_x-iH_y)(M_x+iM_y)-iH_z(M_z-iM_0)\right]-\left[-i(E_x-iE_y)(N_x+iN_y)-iE_z(N_z-iN_0)\right]\Big\} = 0,$$

or, according to (229),

$$\left.\begin{aligned}
D\psi_1+\frac{he}{2\pi c}\Big\{\left[H_z\psi_1-(H_x+iH_y)\psi_2\right]-i\left[E_z\psi_3-(E_x+iE_y)\psi_4\right]\Big\} &= 0\\
D\psi_2+\frac{he}{2\pi c}\Big\{-\left[(H_x-iH_y)\psi_1+H_z\psi_2\right]+i\left[(E_x-iE_y)\psi_3+E_z\psi_4\right]\Big\} &= 0
\end{aligned}\right\} \tag{230}$$
In the same way the four remaining equations (226)-(226 a) are folded up into

$$\left.\begin{aligned}
D\psi_3+\frac{he}{2\pi c}\Big\{-i\left[E_z\psi_1-(E_x+iE_y)\psi_2\right]+\left[H_z\psi_3-(H_x+iH_y)\psi_4\right]\Big\} &= 0\\
D\psi_4+\frac{he}{2\pi c}\Big\{+i\left[(E_x-iE_y)\psi_1+E_z\psi_2\right]-\left[(H_x-iH_y)\psi_3+H_z\psi_4\right]\Big\} &= 0
\end{aligned}\right\} \tag{230 a}$$

They can be derived from (230) if $\psi_1$ and $\psi_2$ are replaced by $\psi_3$ and $\psi_4$, and the latter by $\psi_1$ and $\psi_2$. Both the equations (230) and (230 a) can be obtained, of course, directly from equations (229 a, b) in the same way as the equations (226)-(226 a) are obtained from (224)-(225 a), i.e. by the application of the operators $u$ to the left side of (229 a, b). The latter equations were established by Dirac in an externally different form and by a different method, which will be indicated later and which does not make use of the formal analogy between wave mechanics and the electromagnetic theory of light. We shall see that this analogy is actually not so deep as it seems at first sight, and that the regrouping of the equations (224)-(225 a), which is necessary for their folding up into the Dirac equations, is a formal expression of a drastic divergence between the wave-mechanical functions $\mathbf{M}$, $\mathbf{N}$ and the electromagnetic functions $\mathbf{H}$, $\mathbf{E}$.

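It is interesting to notice that a similar regrouping and folding up can be carried out with regard to Maxwell's equations. These 'disguised' Maxwell's equations can be obtained from equations (229 a) and (229 b) by putting $e = m_0 = 0$, and further by replacing the vectors $\mathbf{M}$ and $\mathbf{N}$ in the definition (229) of the functions $\psi$ by $\mathbf{H}$ and $\mathbf{E}$, dropping the terms $M_0$, $N_0$.

In fact, it can be directly verified that if we put

$$\left.\begin{aligned}
H_x+iH_y &= \psi_1, & H_z &= \psi_2\\
-i(E_x+iE_y) &= \psi_3, & -iE_z &= \psi_4
\end{aligned}\right\} \tag{231}$$

we obtain, instead of the eight equations (215)-(216 a), the following four equations:

$$\left.\begin{aligned}
\left(\frac{\partial}{\partial x}-i\frac{\partial}{\partial y}\right)\psi_1+\frac{\partial\psi_2}{\partial z}+\frac{1}{c}\frac{\partial\psi_4}{\partial t} &= 0\\
\left(\frac{\partial}{\partial x}+i\frac{\partial}{\partial y}\right)\psi_2-\frac{\partial\psi_1}{\partial z}+\frac{1}{c}\frac{\partial\psi_3}{\partial t} &= 0\\
\left(\frac{\partial}{\partial x}-i\frac{\partial}{\partial y}\right)\psi_3+\frac{\partial\psi_4}{\partial z}+\frac{1}{c}\frac{\partial\psi_2}{\partial t} &= 0\\
\left(\frac{\partial}{\partial x}+i\frac{\partial}{\partial y}\right)\psi_4-\frac{\partial\psi_3}{\partial z}+\frac{1}{c}\frac{\partial\psi_1}{\partial t} &= 0
\end{aligned}\right\} \tag{231 a}$$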

Another well-known possibility of reducing the eight Maxwell equations to four consists in combining the electric and the magnetic field strengths to form a complex vector

$$\mathbf{K} = \mathbf{H}+i\mathbf{E}.$$

We then obtain, instead of (215)-(216 a), four equations of a similar type, namely,

$$\operatorname{curl}\mathbf{K}+\frac{i}{c}\frac{\partial\mathbf{K}}{\partial t} = 0,\qquad \operatorname{div}\mathbf{K} = 0.$$

This method is not applicable to the generalized Maxwell equations (224)-(225 a).

The formulae (231) and (231 a) correspond to the union of the variables $x$ and $y$ as well as of the corresponding components of various real vectors to form complex quantities $w = x+iy$, $H_x+iH_y = \psi_1$, $E_x+iE_y = i\psi_3$, etc. The operators $\partial/\partial x-i\,\partial/\partial y$ and $\partial/\partial x+i\,\partial/\partial y$ can thereby be regarded (apart from a factor 2) as the differential operators $\partial/\partial w$ and $\partial/\partial w^*$ corresponding to the complex variable $w$ and the complex conjugate variable $w^* = x-iy$ respectively.

While we can regard the formulae (231) as a decomposition of the complex functions $\psi_1, \ldots, \psi_4$ into real and imaginary parts, this is not so in the case of the analogous formulae (229). The fact that all the eight quantities $\mathbf{M}$, $\mathbf{N}$ must in general assume complex values follows immediately from the complex nature of the operators $u$ in the equations (224)-(225 a) determining them.

The reduction of these eight equations to the four equations (229 a, b) 
is, therefore, an actual halving of the number of unknowns, while in the 
case of the Maxwell equations we have simply a union of real quantities 
—as the components of the electromagnetic field are—to form complex 
quantities. 

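If the four complex quantities $\psi_1, \ldots, \psi_4$ actually suffice for the complete determination of the electron waves it must be possible by means of these functions to express the statistical quantities, i.e. the probability density $\rho$ and the components of the probability current density $j_x$, $j_y$, $j_z$ which we have determined earlier by means of the scalar $\psi$. In the new determination of these quantities we shall at first be guided by the same analogy as that which led us to the generalized Maxwell equations, or to the Dirac equations equivalent to them. From this point of view the quantities $\rho$ and $\mathbf{j}$ must correspond respectively to the electromagnetic energy density

$$\rho = \frac{1}{8\pi}(E^2+H^2)$$

and the energy-current density (i.e. to Poynting's vector)

$$\mathbf{J} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H}.$$

If we put here, instead of the components of $\mathbf{E}$ and $\mathbf{H}$, their expressions obtained from (231):

$$\begin{aligned}
H_x &= \tfrac{1}{2}(\psi_1+\psi_1^*), & H_y &= \tfrac{1}{2i}(\psi_1-\psi_1^*), & H_z &= \psi_2 = \psi_2^*,\\
E_x &= \tfrac{i}{2}(\psi_3-\psi_3^*), & E_y &= \tfrac{1}{2}(\psi_3+\psi_3^*), & E_z &= i\psi_4 = -i\psi_4^*,
\end{aligned}$$

we obtain

$$\rho = \frac{1}{8\pi}(\psi_1\psi_1^*+\psi_2\psi_2^*+\psi_3\psi_3^*+\psi_4\psi_4^*),$$

and further,

$$\begin{aligned}
J_z = \frac{c}{4\pi}(E_xH_y-E_yH_x) &= \frac{c}{16\pi}\left[(\psi_3-\psi_3^*)(\psi_1-\psi_1^*)-(\psi_3+\psi_3^*)(\psi_1+\psi_1^*)\right]\\
&= \frac{c}{8\pi}(-\psi_1\psi_3^*-\psi_3\psi_1^*+\psi_2\psi_4^*+\psi_4\psi_2^*),
\end{aligned}$$

(the last two terms cancelling each other here, since $\psi_2$ is real and $\psi_4$ purely imaginary), and similar formulae for $J_x$ and $J_y$.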

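These quadratic expressions are clearly real and also remain real when all the four quantities $\psi_1, \ldots, \psi_4$ are complex. We are led, therefore, to use them for the representation of the quantities $\rho$ and $\mathbf{j}$. Omitting the common factor $1/8\pi$, we obtain

$$\rho = \psi_1\psi_1^*+\psi_2\psi_2^*+\psi_3\psi_3^*+\psi_4\psi_4^* \tag{232}$$

$$\left.\begin{aligned}
j_x &= c(\psi_1\psi_4^*+\psi_4\psi_1^*+\psi_2\psi_3^*+\psi_3\psi_2^*)\\
j_y &= -ic(\psi_1\psi_4^*-\psi_4\psi_1^*+\psi_3\psi_2^*-\psi_2\psi_3^*)\\
j_z &= c(-\psi_1\psi_3^*-\psi_3\psi_1^*+\psi_2\psi_4^*+\psi_4\psi_2^*)
\end{aligned}\right\} \tag{232 a}$$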
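
The correspondence between these bilinear expressions and the electromagnetic energy density and Poynting vector can be checked with arbitrary real field components. In the sketch below (names hypothetical, the factors c and 1/8π left out) the four functions are built from real E and H as H_x + iH_y, H_z, −i(E_x + iE_y), −iE_z; the density should then equal E² + H² and the three current-like combinations should equal twice the components of E × H.

```python
import random


def psi_from_fields(E, H):
    """Four complex functions built from real electric and magnetic components."""
    ex, ey, ez = E
    hx, hy, hz = H
    return (hx + 1j * hy, complex(hz), -1j * (ex + 1j * ey), -1j * ez)


def bilinear_density(psi):
    """Sum of psi_k psi_k* over the four functions."""
    return sum((p * p.conjugate()).real for p in psi)


def bilinear_current(psi):
    """Current-like bilinear combinations, without the factor c."""
    p1, p2, p3, p4 = psi
    q1, q2, q3, q4 = (p.conjugate() for p in psi)
    jx = p1 * q4 + p4 * q1 + p2 * q3 + p3 * q2
    jy = -1j * (p1 * q4 - p4 * q1 + p3 * q2 - p2 * q3)
    jz = -p1 * q3 - p3 * q1 + p2 * q4 + p4 * q2
    return jx, jy, jz


def cross(a, b):
    """Vector product of two real 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def max_error(seed=1):
    """Largest deviation from rho = E^2 + H^2 and j = 2 E x H for random fields."""
    rng = random.Random(seed)
    E = tuple(rng.uniform(-1, 1) for _ in range(3))
    H = tuple(rng.uniform(-1, 1) for _ in range(3))
    psi = psi_from_fields(E, H)
    energy = sum(x * x for x in E) + sum(x * x for x in H)
    worst = abs(bilinear_density(psi) - energy)
    poynting = cross(E, H)
    for j, s in zip(bilinear_current(psi), poynting):
        worst = max(worst, abs(j - 2 * s))
    return worst
```

The factor 2 reflects the ratio of the prefactors c/4π (Poynting vector) and c/8π (the omitted common factor).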
If these expressions are correct they must, like the expressions obtained earlier for $\rho$ and $\mathbf{j}$, satisfy the equation

$$\frac{\partial\rho}{\partial t}+\frac{\partial j_x}{\partial x}+\frac{\partial j_y}{\partial y}+\frac{\partial j_z}{\partial z} = 0 \tag{232 b}$$

expressing the law of the conservation of probability (or of the number of copies). It can easily be shown by means of equations (229 a, b) that this is indeed the case.
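Multiplying these equations successively by $\psi_4^*$, $\psi_3^*$, $\psi_2^*$, $\psi_1^*$, subtracting from them the corresponding conjugate equations

$$(u_x^*+iu_y^*)\psi_1^*+u_z^*\psi_2^*+u_t'^*\psi_4^* = 0,$$

$$(u_x^*-iu_y^*)\psi_2^*-u_z^*\psi_1^*+u_t'^*\psi_3^* = 0,$$

etc., multiplied by $\psi_4$, $\psi_3$, etc., and finally adding the results, we get:

$$\begin{aligned}
&[\psi_4^*(u_x-iu_y)\psi_1-\psi_1(u_x^*+iu_y^*)\psi_4^*]+[\psi_3^*(u_x+iu_y)\psi_2-\psi_2(u_x^*-iu_y^*)\psi_3^*]\\
&+[\psi_2^*(u_x-iu_y)\psi_3-\psi_3(u_x^*+iu_y^*)\psi_2^*]+[\psi_1^*(u_x+iu_y)\psi_4-\psi_4(u_x^*-iu_y^*)\psi_1^*]\\
&+(\psi_4^*u_z\psi_2-\psi_2u_z^*\psi_4^*)-(\psi_3^*u_z\psi_1-\psi_1u_z^*\psi_3^*)+(\psi_2^*u_z\psi_4-\psi_4u_z^*\psi_2^*)-(\psi_1^*u_z\psi_3-\psi_3u_z^*\psi_1^*)\\
&+(\psi_4^*u_t'\psi_4-\psi_4u_t'^*\psi_4^*)+(\psi_3^*u_t'\psi_3-\psi_3u_t'^*\psi_3^*)+(\psi_2^*u_t''\psi_2-\psi_2u_t''^*\psi_2^*)+(\psi_1^*u_t''\psi_1-\psi_1u_t''^*\psi_1^*) = 0,
\end{aligned}$$

which, by the definition of the operators $u$, easily reduces to (232 b) with the expressions (232) and (232 a) for $\rho$ and $j_x$, $j_y$, $j_z$.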
Formula (232) is the immediate generalization of the formula $\rho = \psi\psi^*$ of the original non-relativistic Schrödinger theory. On the other hand, the expressions (232 a) have a form entirely different from the original expressions for the current density

$$j_x = \frac{h}{4\pi im_0}\left(\psi^*\frac{\partial\psi}{\partial x}-\psi\frac{\partial\psi^*}{\partial x}\right).$$

A more accurate investigation of equations (229 a, b) shows, however, that this difference is not so great as it seems. With harmonically vibrating waves, corresponding to a motion with a definite energy $\epsilon$, the dependence of the functions $\psi_1, \ldots, \psi_4$ on the time is described by the common factor $e^{-i2\pi\epsilon t/h}$, so that the operators $u_t'$ and $u_t''$ reduce to the ordinary factors

$$\left.\begin{aligned}
u_t' &= -\frac{1}{c}(\epsilon-U+m_0c^2) = -\frac{1}{c}(W-U+2m_0c^2)\\
u_t'' &= -\frac{1}{c}(\epsilon-U-m_0c^2) = -\frac{1}{c}(W-U)
\end{aligned}\right\} \tag{233}$$

where $U = e\phi$ is the potential energy of the electron, $W = \epsilon-m_0c^2$, and $W-U$ is its kinetic energy. In general (so far as we restrict ourselves to positive values of $\epsilon$, see below), the first factor is enormously large compared with the second; therefore the functions $\psi_3$ and $\psi_4$ which are multiplied by it in equations (229 a) must, with regard to their absolute magnitude, be very small compared with the functions $\psi_1$ and $\psi_2$. If, moreover, we restrict ourselves to the case of motion with a kinetic energy $W-U$, which is small compared with the rest-energy $m_0c^2$, i.e. with a velocity $v$ whose square is small compared with $c^2$, we can put approximately, according to (229 a),

$$\left.\begin{aligned}
2m_0c\,\psi_3 &= (u_x+iu_y)\psi_2-u_z\psi_1\\
2m_0c\,\psi_4 &= (u_x-iu_y)\psi_1+u_z\psi_2
\end{aligned}\right\} \tag{233 a}$$

Since these relations no longer contain the energy $\epsilon$, they may be regarded as approximately valid in the general case of non-conservative motion.
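
To get a feeling for the orders of magnitude involved, one can compare the two time factors for a slow electron. The sketch below is illustrative only: the rest energy of the electron is taken as about 0.511 MeV, the kinetic energy of 13.6 eV used in the usage note is an assumed example (of the order met with in the hydrogen atom), and the function names are hypothetical.

```python
import math

REST_ENERGY_EV = 510_998.95   # electron rest energy m0*c^2 in electron-volts (rounded)


def factor_ratio(kinetic_ev):
    """Ratio of the small time factor to the large one:
    (W - U) / (W - U + 2 m0 c^2)."""
    return kinetic_ev / (kinetic_ev + 2 * REST_ENERGY_EV)


def v_over_c(kinetic_ev):
    """Non-relativistic estimate of v/c from (W - U) = m0 v^2 / 2."""
    return math.sqrt(2 * kinetic_ev / REST_ENERGY_EV)
```

For an assumed kinetic energy of 13.6 eV, `factor_ratio(13.6)` is of the order of 10⁻⁵ and `v_over_c(13.6)` is somewhat below 10⁻², which is the sense in which the second pair of functions is small compared with the first.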

It should be mentioned that, according to (233 a), the ratio of the functions $\psi_3$, $\psi_4$ to the functions $\psi_1$, $\psi_2$ is of the order of magnitude $g/(m_0c) = v/c$, where $v$ is the velocity of the electron, and $g$ is its proper momentum estimated roughly by the ratio $u\psi_{1,2}/\psi_{1,2}$. It follows from this that, to the first approximation with regard to small quantities of the order $v/c$, we can put, instead of (232),

$$\rho = \psi_1\psi_1^*+\psi_2\psi_2^*, \tag{234}$$

neglecting the squares of $\psi_3$ and $\psi_4$. Substituting the expressions (233 a) in (232 a), we get further,

$$\begin{aligned}
j_x &= \frac{1}{2m_0}\Big\{[\psi_1^*(u_x-iu_y)\psi_1+\psi_1^*u_z\psi_2]+[\psi_1(u_x^*+iu_y^*)\psi_1^*+\psi_1u_z^*\psi_2^*]\\
&\qquad+[\psi_2^*(u_x+iu_y)\psi_2-\psi_2^*u_z\psi_1]+[\psi_2(u_x^*-iu_y^*)\psi_2^*-\psi_2u_z^*\psi_1^*]\Big\}\\
&= \frac{1}{2m_0}\Big\{(\psi_1^*u_x\psi_1+\psi_1u_x^*\psi_1^*+\psi_2^*u_x\psi_2+\psi_2u_x^*\psi_2^*)\\
&\qquad+i[(\psi_2^*u_y\psi_2-\psi_2u_y^*\psi_2^*)-(\psi_1^*u_y\psi_1-\psi_1u_y^*\psi_1^*)]\\
&\qquad+[(\psi_1^*u_z\psi_2-\psi_2^*u_z\psi_1)+(\psi_1u_z^*\psi_2^*-\psi_2u_z^*\psi_1^*)]\Big\}.
\end{aligned}$$
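If we put $A_x = A_y = A_z = 0$ (i.e. if we neglect the potential momentum, if any, compared with the proper momentum), we obtain the following formula:

$$j_x = \frac{h}{4\pi im_0}\left[\psi_1^*\frac{\partial\psi_1}{\partial x}-\psi_1\frac{\partial\psi_1^*}{\partial x}+\psi_2^*\frac{\partial\psi_2}{\partial x}-\psi_2\frac{\partial\psi_2^*}{\partial x}\right]+\frac{h}{4\pi m_0}\frac{\partial}{\partial y}(\psi_2\psi_2^*-\psi_1\psi_1^*)+\frac{ih}{4\pi m_0}\frac{\partial}{\partial z}(\psi_1\psi_2^*-\psi_2\psi_1^*),$$

the first term of which (in square brackets) is the same generalization of the original expression

$$j_x = \frac{h}{4\pi im_0}\left(\psi^*\frac{\partial\psi}{\partial x}-\psi\frac{\partial\psi^*}{\partial x}\right)$$

as (234) is of the original expression $\psi\psi^*$ for $\rho$. The physical meaning of the two additional terms will be cleared up in the next section. From the purely formal point of view, these two terms, as well as the corresponding terms in $j_y$ and $j_z$, can be regarded as the x-, y-, and z-components of the curl of a certain vector $c\mathbf{m}$, defined by the formulae

$$\left.\begin{aligned}
m_x &= \frac{h}{4\pi m_0c}(\psi_1\psi_2^*+\psi_2\psi_1^*), \qquad m_y = -\frac{ih}{4\pi m_0c}(\psi_1\psi_2^*-\psi_2\psi_1^*)\\
m_z &= \frac{h}{4\pi m_0c}(\psi_2\psi_2^*-\psi_1\psi_1^*)
\end{aligned}\right\} \tag{234 a}$$

so that the approximate expression for the current density in vector form is:

$$\mathbf{j} = \frac{h}{4\pi im_0}(\psi_1^*\nabla\psi_1-\psi_1\nabla\psi_1^*+\psi_2^*\nabla\psi_2-\psi_2\nabla\psi_2^*)+c\operatorname{curl}\mathbf{m}. \tag{234 b}$$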
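If, further, we substitute the approximate expressions (233 a) in equations (229 b), the latter assume the following form:

$$(u_x-iu_y)(u_x+iu_y)\psi_2-(u_x-iu_y)u_z\psi_1+u_z(u_x-iu_y)\psi_1+u_z^2\psi_2+2m_0cu_t''\psi_2 = 0,$$

$$(u_x+iu_y)(u_x-iu_y)\psi_1+(u_x+iu_y)u_z\psi_2-u_z(u_x+iu_y)\psi_2+u_z^2\psi_1+2m_0cu_t''\psi_1 = 0.$$

Now according to the relations (218) and (218 a), we have

$$(u_x-iu_y)(u_x+iu_y) = u_x^2+u_y^2+i(u_xu_y-u_yu_x) = u_x^2+u_y^2-\frac{he}{2\pi c}H_z,$$

$$(u_x+iu_y)(u_x-iu_y) = u_x^2+u_y^2+\frac{he}{2\pi c}H_z,$$

$$u_z(u_x-iu_y)-(u_x-iu_y)u_z = \frac{he}{2\pi c}(-H_x+iH_y),$$

$$(u_x+iu_y)u_z-u_z(u_x+iu_y) = -\frac{he}{2\pi c}(H_x+iH_y).$$

We have further

$$2m_0cu_t'' = 2m_0\left[\left(\frac{h}{2\pi i}\frac{\partial}{\partial t}+U\right)+m_0c^2\right],$$

which reduces to $-2m_0(W-U)$ for conservative motion. We can drop the constant term $m_0c^2$ if $\frac{h}{2\pi i}\frac{\partial}{\partial t}$ is assumed to reduce in the latter case to $-W$ and not to $-\epsilon$ (this constant term entails an irrelevant factor $e^{-i2\pi m_0c^2t/h}$ in the expression of the functions $\psi_1$, $\psi_2$). With this condition, the preceding equations can be written down in the following form:

$$\left.\begin{aligned}
\left(\frac{1}{2m_0}u^2+\frac{h}{2\pi i}\frac{\partial}{\partial t}+U\right)\psi_1+\frac{he}{4\pi m_0c}\left[H_z\psi_1-(H_x+iH_y)\psi_2\right] &= 0\\
\left(\frac{1}{2m_0}u^2+\frac{h}{2\pi i}\frac{\partial}{\partial t}+U\right)\psi_2+\frac{he}{4\pi m_0c}\left[-(H_x-iH_y)\psi_1-H_z\psi_2\right] &= 0
\end{aligned}\right\} \tag{235}$$

where $\mathbf{u}$ is the (three-dimensional) vector with the components $u_x$, $u_y$, $u_z$.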

These equations represent the approximate form of the relativistic second-order equations (230) and can indeed be obtained from them by dropping the small quantities $\psi_3$, $\psi_4$. The approximation involved corresponds to neglecting terms of the second and higher orders in $v/c$, including those which represent the variation of mass with velocity. It must be mentioned that, although the functions $\psi_3$, $\psi_4$ are themselves small of the first order with regard to $\psi_1$, $\psi_2$, they are multiplied, in equations (230), by the factor $he/2\pi c$, which can be regarded as a small quantity of the first order (in $1/c$).
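
The magnetic terms of the two approximate equations act on the pair (ψ₁, ψ₂) as a 2×2 matrix. Leaving out the common factor he/(4πm₀c), and assuming the form in which the first equation contains H_zψ₁ − (H_x + iH_y)ψ₂ and the second −(H_x − iH_y)ψ₁ − H_zψ₂, this matrix is Hermitian and its square is H² times the unit matrix, so that its two characteristic values are +|H| and −|H|. The sketch below (function names hypothetical) verifies both properties numerically.

```python
def magnetic_matrix(hx, hy, hz):
    """2x2 matrix of the magnetic terms acting on (psi_1, psi_2),
    without the common factor h*e / (4*pi*m0*c)."""
    return [[complex(hz), -(hx + 1j * hy)],
            [-(hx - 1j * hy), complex(-hz)]]


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def is_hermitian(m, tol=1e-12):
    """True if the matrix equals its own conjugate transpose."""
    return all(abs(m[i][j] - m[j][i].conjugate()) < tol
               for i in range(2) for j in range(2))


def squares_to_field_strength(hx, hy, hz, tol=1e-12):
    """True if the matrix squared equals (hx^2 + hy^2 + hz^2) times the unit matrix."""
    m = magnetic_matrix(hx, hy, hz)
    m2 = matmul2(m, m)
    h2 = hx * hx + hy * hy + hz * hz
    return (abs(m2[0][0] - h2) < tol and abs(m2[1][1] - h2) < tol
            and abs(m2[0][1]) < tol and abs(m2[1][0]) < tol)
```

Two characteristic values of opposite sign for every direction of the field are what one expects of a term describing a two-valued intrinsic magnetic moment.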

If, in equations (235), we drop the additional terms, proportional to the magnetic intensity (putting either $\mathbf{H} = 0$ or $e = 0$), they reduce to equation (203 a), § 26, the two functions $\psi_1$ and $\psi_2$ becoming identical with the single function $\psi$ of the previous theory. Equations (235) thus give a more complete description of the motion than equation (203). In fact they exhibit the duplicity phenomenon which has already been indicated in Part I, § 19, and traced to the electron's 'spin' or 'intrinsic magnetic moment'. To these properties correspond additional forces, which are represented by the additional terms, proportional to the magnetic field in equation (235), and also to the electric field in the exact equations (230).

The duplicity phenomenon, as explained in Part I, in its simplest
form consists in the splitting-up of each quantized state, as determined
by Bohr's theory, into two states which in general have slightly
different energies. So far as the number of states is concerned, Bohr's
theory gives the same results as the ordinary Schrödinger equation with
one wave function $\psi$. Now to each solution of this equation, $\psi_{H'}$ say,
there corresponds a set of two solutions of the system of equations (235)
or rather of the equations obtained from them, if the operator $-p_t$ is
replaced by the energy constant.

This means that to each energy-value $H'$ of the ordinary Schrödinger
equation there correspond two slightly different energy values, $H'_+$ and
$H'_-$ say, of the system of equations (235). Each of these energy values
is associated with a set of two functions, $\psi_{1H'_+}$, $\psi_{2H'_+}$ and $\psi_{1H'_-}$, $\psi_{2H'_-}$;
these four functions replace the single function $\psi_{H'}$ of the Schrödinger
theory.

If, instead of the approximate equations (235), we take the system
of four exact equations (230) and (230a), then by a similar argument
it seems to follow that to each state of the ordinary Schrödinger theory
there correspond, according to the exact theory, four states, whose
energies, if the magnetic and electric field strengths are not too large,
lie close to the energy $H'$ of the single Schrödinger state.

This conclusion is, however, fallacious, for the four second-order 
equations (230)-(230a) are not independent of each other, being in fact 
derived from the four first-order equations (229a)-(229b). So far as
the number of solutions (i.e. states) is concerned, the latter are equi- 
valent to two of the four second-order equations derived from them. 
We get, therefore, with the exact equations (230)-(230a), a duplicity 
phenomenon of the same type as with the approximate equations (235), 
the value of the energy being, of course, somewhat different in the 
exact theory from what it is in the approximate theory. 

The exact theory, when compared with the approximate theory or 
with the original non-relativistic Schrödinger theory, leads, however,
to an additional duplicity phenomenon of an entirely different type,
which is not connected with the 'spin' property, but can be referred
to as due to the variation of the mass with velocity. This type of
duplicity is already implied in the relativistic equation with the single
function $\psi$, which was derived at the beginning of this chapter. We
come upon it in its simplest form in the case of free motion, when the
operators $u_x$, $u_y$, $u_z$, $u_t$ can be replaced by ordinary numbers (multipliers)
$g_x$, $g_y$, $g_z$, $\varepsilon/c$ representing respectively the components of the momentum
and the energy, including the rest-energy $m_0c^2$, divided by $c$. Equations
(230) reduce in this case to the same form as equation (196), namely,

$$\left(g_x^2 + g_y^2 + g_z^2 - \frac{\varepsilon^2}{c^2} + m_0^2c^2\right)\psi = 0,$$

which is equivalent to the ordinary relativistic relation between momentum
and energy

$$g^2 - \frac{\varepsilon^2}{c^2} + m_0^2c^2 = 0.$$


Now since this relation contains the square of the energy, it leads to
two numerically equal values of the latter, one positive and the other
negative,

$$\varepsilon = \pm c\sqrt{m_0^2c^2 + g^2}.$$

In Einstein's mechanics, the negative value was rejected as having
no physical meaning. It has, however, been explained already in Part I,
§ 19, that this rejection is not justified in wave mechanics, because of
the possibility of a continuous transition from a state of positive to
that of negative energy $\varepsilon$ through imaginary values of the velocity or
because of a 'jump' produced by some perturbing forces.

In the case of non-relativistic wave mechanics, we have, under the
same conditions (free motion),

$$g^2 - 2m_0W = 0,$$

where $W$ is the ordinary (kinetic) energy, not including the rest-energy
$m_0c^2$. This non-relativistic energy is related to the positive energy $\varepsilon$ of
relativity mechanics by the equation

$$W = \varepsilon - m_0c^2,$$

whereas the negative energy $\varepsilon$ has no counterpart in non-relativity
mechanics. The appearance of the negative energy $\varepsilon$ in addition to the
positive energy forms the essence of the duplicity phenomenon of the
second kind. The situation is not substantially changed in the general
case of motion in a conservative field of force, the only difference being
that the positive and negative energies of the corresponding states are
not numerically equal.
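
The two roots of the energy relation for free motion, and the way in which the positive root goes over into the non-relativistic energy $W$, can be illustrated numerically. The following is only a minimal sketch: the function names and the CGS values of the constants are assumptions introduced for the illustration and do not occur in the text.

```python
import math

C = 2.99792458e10    # velocity of light in cm/s (assumed CGS value)
M0 = 9.1093837e-28   # rest-mass of the electron in g (assumed CGS value)


def relativistic_energies(g):
    """Return both roots of g^2 - eps^2/c^2 + m0^2 c^2 = 0 for momentum g."""
    eps = C * math.sqrt(M0**2 * C**2 + g**2)
    return eps, -eps


def kinetic_energy(g):
    """W = eps - m0 c^2, defined for the positive root only."""
    eps_plus, _ = relativistic_energies(g)
    return eps_plus - M0 * C**2


g = 0.01 * M0 * C                 # momentum corresponding to v/c of about 0.01
eps_plus, eps_minus = relativistic_energies(g)
w_rel = kinetic_energy(g)         # from the relativistic relation
w_nonrel = g**2 / (2 * M0)        # from g^2 - 2 m0 W = 0
```

For this small momentum `w_rel` and `w_nonrel` differ only by terms of relative order $(v/c)^2$, while `eps_minus` has no non-relativistic counterpart at all.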

Combining the two duplicity phenomena (that due to the spin and that
due to the relativistic variation of the mass) we get a quadruplicity
phenomenon which can conveniently (though not quite correctly) be
associated with the replacement of the single $\psi$-function of the Schrödinger
theory by the four $\psi$-functions of Dirac's theory. This association
is not quite correct, for the same quadruplicity phenomenon would
result from Pauli's theory, based on the use of two functions $\psi_1$ and $\psi_2$,
if, in the approximate equations (235) defining them, the non-relativistic
operator $u^2/2m_0 + p_t + U$ were replaced by the corresponding relativistic
operator of the second order, $D = (u^2 - u_t^2 + m_0^2c^2)/2m_0$. It must be
mentioned, however, that in doing this we should be guilty of inconsistency,
because, having dropped additional terms of the second order
proportional to the electric field strength in deriving the approximate
equations (235) from the exact equations (230), we must also drop
second- and higher-order terms, representing the dependence of mass
upon velocity, in the main operator $D$.

In the case of free motion (represented by plane waves), there exists
a very simple relation between the four functions $\psi$ referring to the
positive energy and the corresponding negative energy solution of the
Dirac equations (229a)-(229b). Putting

$$\psi_k = a_k e^{i2\pi(g_xx + g_yy + g_zz - \varepsilon t)/h}, \tag{236}$$

where the $a_k$ are constants ($k = 1, 2, 3, 4$), we can replace them by the
following algebraic system:


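$$\begin{aligned}
(g_x - ig_y)a_1 + g_za_2 + \frac{1}{c}(\varepsilon + m_0c^2)a_4 &= 0\\
(g_x + ig_y)a_2 - g_za_1 + \frac{1}{c}(\varepsilon + m_0c^2)a_3 &= 0
\end{aligned} \tag{236 a}$$

$$\begin{aligned}
(g_x - ig_y)a_3 + g_za_4 + \frac{1}{c}(\varepsilon - m_0c^2)a_2 &= 0\\
(g_x + ig_y)a_4 - g_za_3 + \frac{1}{c}(\varepsilon - m_0c^2)a_1 &= 0
\end{aligned} \tag{236 b}$$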

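If, in these equations, the energy $\varepsilon$ is replaced by $-\varepsilon$, then the first
two become identical with the second two and the latter with the
former if simultaneously $a_1$, $a_2$ are replaced by $a_3$, $a_4$ and $a_3$, $a_4$ by
$-a_1$, $-a_2$. This means that, with

$$\psi_1 = \psi_1', \quad \psi_2 = \psi_2', \quad \psi_3 = \psi_3', \quad \psi_4 = \psi_4',$$

corresponding to $\varepsilon = \varepsilon' > 0$, we have

$$\psi_1 = \psi_3', \quad \psi_2 = \psi_4', \quad \psi_3 = -\psi_1', \quad \psi_4 = -\psi_2',$$

for $\varepsilon = -\varepsilon'$.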
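
The symmetry just stated can be checked numerically for the amplitudes $a_k$. The sketch below is hedged in two respects. It assumes that the algebraic system has the form $(g_x - ig_y)a_1 + g_za_2 + \frac{1}{c}(\varepsilon + m_0c^2)a_4 = 0$, $(g_x + ig_y)a_2 - g_za_1 + \frac{1}{c}(\varepsilon + m_0c^2)a_3 = 0$, $(g_x - ig_y)a_3 + g_za_4 + \frac{1}{c}(\varepsilon - m_0c^2)a_2 = 0$, $(g_x + ig_y)a_4 - g_za_3 + \frac{1}{c}(\varepsilon - m_0c^2)a_1 = 0$, which is the placing of indices compatible with the substitution rule quoted above. The function names and the units $c = m_0c^2 = 1$ are likewise assumptions made for the illustration.

```python
import math


def residuals(a, g, eps, m0c2=1.0, c=1.0):
    """Left-hand sides of the assumed system (236 a)-(236 b)."""
    a1, a2, a3, a4 = a
    gx, gy, gz = g
    g_minus, g_plus = complex(gx, -gy), complex(gx, gy)
    k_plus, k_minus = (eps + m0c2) / c, (eps - m0c2) / c
    return (
        g_minus * a1 + gz * a2 + k_plus * a4,
        g_plus * a2 - gz * a1 + k_plus * a3,
        g_minus * a3 + gz * a4 + k_minus * a2,
        g_plus * a4 - gz * a3 + k_minus * a1,
    )


def positive_energy_solution(g, m0c2=1.0, c=1.0):
    """Choose a1 = 1, a2 = 0 and solve the first pair for a3, a4."""
    gx, gy, gz = g
    eps = math.sqrt(m0c2**2 + c**2 * (gx * gx + gy * gy + gz * gz))
    k_plus = (eps + m0c2) / c
    a1, a2 = 1.0, 0.0
    a3 = -(complex(gx, gy) * a2 - gz * a1) / k_plus
    a4 = -(complex(gx, -gy) * a1 + gz * a2) / k_plus
    return eps, (a1, a2, a3, a4)


def to_negative_energy(a):
    """a1, a2 -> a3, a4 and a3, a4 -> -a1, -a2, as stated in the text."""
    a1, a2, a3, a4 = a
    return (a3, a4, -a1, -a2)


g = (0.3, -0.2, 0.5)
eps, a = positive_energy_solution(g)
res_pos = residuals(a, g, eps)
res_neg = residuals(to_negative_energy(a), g, -eps)
```

Both `res_pos` and `res_neg` vanish to rounding error, and for the positive root the amplitudes $a_3$, $a_4$ come out small compared with $a_1$ whenever $g \ll m_0c$.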
It has been assumed, hitherto, that the functions $\psi_3$, $\psi_4$ were small
(of the first order in $v/c$) compared with $\psi_1$, $\psi_2$. We now see that this
is only true if we restrict ourselves to positive energy solutions; the
converse is true in the case of negative energy solutions, both for free
motion and for a motion in a conservative field of force.

From the point of view of the old relativity mechanics, the reversal
of the sign of the energy $\varepsilon = c^2m_0/\sqrt{1 - v^2/c^2}$ is equivalent to the reversal
of the sign of the rest-mass $m_0$. This is not exactly true, however, in the
wave-mechanical theory. For a reversal of the sign of $m_0$ in equations
(236 a)-(236 b) leads to the replacement of $\psi_1$, $\psi_2$ by $\psi_3$, $\psi_4$ and $\psi_3$, $\psi_4$ by
$\psi_1$, $\psi_2$ without reversal of the sign of the latter. The two solutions have
nothing to do with each other, since they refer to particles of different
kinds (particles with negative rest-mass being in reality non-existent),
whereas the two solutions corresponding to $\varepsilon = \pm\varepsilon'$ refer to the same
particle with a positive rest-mass $m_0$, the two values of the energy being
due to the ambiguity of sign in the radical of the expression
$\varepsilon = c^2m_0/\sqrt{1 - v^2/c^2}$.

It is important to notice that the states of negative energy, as deter- 
mined by relativity wave mechanics, are not directly observable. Accord- 
ing to Dirac’s theory of the duality of matter and electricity, outlined 
in Part I, § 19, nearly all these states are occupied by electrons, the 
vacant states (‘holes’) being observed as protons. According to the 
revised version of this theory, the holes in question represent not 
protons but positive electrons, which have been recently discovered by 
Anderson in America (1932) and by Blackett and Occhialini in 
England (1933). 


29. The Approximate Pauli Theory in the Two-dimensional 
Matrix Form; Electron’s Magnetic Moment and Angular 
Momentum 

The approximate (non-relativistic) equations (235) were initially obtained
by W. Pauli in 1927, not as an approximation to the Dirac
theory, which was published a year later, but as the result of a
semi-empirical attempt to interpret wave-mechanically the duplicity
phenomenon, which a year before had been incorporated by Uhlenbeck and
Goudsmit into the Bohr theory on the assumption that the electron
possesses a spin motion, with an angular momentum equal to half of
the Bohr unit $h/2\pi$ and a magnetic moment equal to Bohr's magneton
$\mu = eh/(4\pi m_0c)$.

Pauli's equations (235) can actually be put in a form corresponding
to this assumption, i.e. giving a wave-mechanical interpretation of the
electron's 'spin', and, indeed, by using a matrix notation, based upon
the representation of the two functions $\psi_1$, $\psi_2$ as the elements of a
one-column matrix

$$\psi = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix},$$

the conjugate complex functions $\psi_1^*$, $\psi_2^*$ forming the adjoint one-row
matrix

$$\psi^\dagger = \{\psi_1^*, \psi_2^*\}. \tag{237}$$

Under this condition, the two equations (235) can be written in the form

$$P\psi = 0, \tag{238}$$

where $P$ is a square 'operator-matrix' of the second rank

$$P = \begin{pmatrix}P_{11} & P_{12}\\ P_{21} & P_{22}\end{pmatrix} \tag{238 a}$$

with suitably defined elements. These elements must be defined in such
a way that the two equations (235) assume the form

$$\begin{aligned}
(P\psi)_1 &= P_{11}\psi_1 + P_{12}\psi_2 = 0\\
(P\psi)_2 &= P_{21}\psi_1 + P_{22}\psi_2 = 0
\end{aligned} \tag{238 b}$$
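Hence it follows that

$$P = \left(\frac{1}{2m_0}u^2 + p_t + U\right)\delta - \mu\,\mathbf{H}\cdot\boldsymbol{\sigma}, \tag{239}$$

where

$$\delta = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} \tag{239 a}$$

is the unit matrix of the second rank and $\boldsymbol{\sigma}$ is a vector matrix with
the following rectangular components:

$$\sigma_x = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}, \quad
\sigma_y = \begin{pmatrix}0 & i\\ -i & 0\end{pmatrix}, \quad
\sigma_z = \begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix}. \tag{239 b}$$

The scalar product $\mathbf{H}\cdot\boldsymbol{\sigma}$ denotes, as usual, the sum $H_x\sigma_x + H_y\sigma_y + H_z\sigma_z$.
This is a matrix with the elements

$$\begin{aligned}
(\mathbf{H}\cdot\boldsymbol{\sigma})_{11} &= -H_z, & (\mathbf{H}\cdot\boldsymbol{\sigma})_{12} &= H_x + iH_y\\
(\mathbf{H}\cdot\boldsymbol{\sigma})_{21} &= H_x - iH_y, & (\mathbf{H}\cdot\boldsymbol{\sigma})_{22} &= +H_z
\end{aligned} \tag{239 c}$$

The matrix $\boldsymbol{\sigma}$ was introduced by Pauli for the wave-mechanical
representation of the electron's magnetic moment which was supposed to be
due to its spin. This 'intrinsic' magnetic moment can be defined as the
operator or matrix

$$\boldsymbol{\mu} = \mu\boldsymbol{\sigma},$$

where $\mu = eh/(4\pi m_0c)$ is the value of the Bohr magneton.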
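
The elements of $\mathbf{H}\cdot\boldsymbol{\sigma}$ quoted above fix the sign convention of the spin matrices used here, which for $\sigma_y$ and $\sigma_z$ is opposite to the one now customary. As a minimal sketch, the sum $H_x\sigma_x + H_y\sigma_y + H_z\sigma_z$ can be formed directly; the names `SIGMA_X`, `SIGMA_Y`, `SIGMA_Z` and `h_dot_sigma` are assumptions of the illustration.

```python
# Spin matrices in the convention implied by the elements of H.sigma
# quoted in the text: (H.sigma)_11 = -H_z, (H.sigma)_12 = H_x + i H_y.
SIGMA_X = ((0, 1), (1, 0))
SIGMA_Y = ((0, 1j), (-1j, 0))
SIGMA_Z = ((-1, 0), (0, 1))


def h_dot_sigma(hx, hy, hz):
    """Return the 2x2 matrix H_x sigma_x + H_y sigma_y + H_z sigma_z."""
    return tuple(
        tuple(
            hx * SIGMA_X[r][c] + hy * SIGMA_Y[r][c] + hz * SIGMA_Z[r][c]
            for c in range(2)
        )
        for r in range(2)
    )


m = h_dot_sigma(1, 2, 3)   # arbitrary field components for illustration
```

With these field components the four elements are $-3$, $1 + 2i$, $1 - 2i$ and $3$, in agreement with the pattern $-H_z$, $H_x + iH_y$, $H_x - iH_y$, $+H_z$.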
The reason for this is that equation (238) can be written in the
usual form

$$(K + p_t)\psi = 0 \tag{240}$$

if $p_t$ is defined as the matrix-operator

$$p_t = \frac{h}{2\pi i}\frac{\partial}{\partial t}\,\delta \tag{240 a}$$

and $K$ as the energy matrix-operator

$$K = \left(\frac{1}{2m_0}u^2 + U\right)\delta - \boldsymbol{\mu}\cdot\mathbf{H}, \tag{240 b}$$

the additional term $-\boldsymbol{\mu}\cdot\mathbf{H}$ having exactly the same form as the energy
of an elementary magnet with a moment $\boldsymbol{\mu}$ in the given external
magnetic field $\mathbf{H}$.

We thus see that the generalization of the Schrödinger theory which
is necessary to account for the spin phenomenon consists in adding to
the energy operator the extra term $-\boldsymbol{\mu}\cdot\mathbf{H}$ and in replacing ordinary
operators by operator-matrices of the second rank, the function $\psi$ being
replaced accordingly by the one-column matrix (237). The old operators
of the Schrödinger theory, such as $\frac{1}{2m_0}u^2 + U$ and $\frac{h}{2\pi i}\frac{\partial}{\partial t}$, are replaced
by their products with the unit matrix of the second rank $\delta$.

In future we shall usually omit the unit matrix, its presence as a
factor being understood whenever we have to deal with an ordinary
operator (like $u^2$ or $U$, etc.) of the old theory. With this convention,
the old theory can be preserved without any change of form whatsoever,
except for the addition of the extra term $-\boldsymbol{\mu}\cdot\mathbf{H}$ to the energy operator
and the corresponding modification of other expressions connected with
the resulting operator $K$.

Thus, for instance, if the characteristic values of $K$, which will be
denoted by $K'$, $K''$, etc., as before, are imagined to be multiplied by
the unit matrix $\delta$, we may write, omitting the latter, in the same way
as in the old theory:

$$(K - K')\psi_{K'} = 0, \tag{241}$$

which is actually equivalent to the system of equations

$$\begin{aligned}
(K_{11} - K')\psi_{K'1} + K_{12}\psi_{K'2} &= 0\\
K_{21}\psi_{K'1} + (K_{22} - K')\psi_{K'2} &= 0
\end{aligned} \tag{241 a}$$

It should be mentioned that Schrödinger's theory can be regarded
as a particular (or rather limiting) case of Pauli's theory, obtained by
putting $\mu = 0$, i.e. by dropping the extra term $-\boldsymbol{\mu}\cdot\mathbf{H}$ in $K$, but
preserving the matrix form of the resulting operator $H$, which can be
defined as the product of the ordinary operator $\frac{1}{2m_0}u^2 + U$ and the unit
matrix $\delta$. The two functions $\psi_1$ and $\psi_2$ become identical in this case
except for a constant factor $\psi_2/\psi_1$, which remains arbitrary, and which,
without loss of generality, can be put equal to zero, the function $\psi_2$
thus vanishing and $\psi_1$ reducing to the ordinary Schrödinger function $\psi$.


Before proceeding further, we must consider the equation which is
satisfied by the function-matrix $\psi^\dagger$, adjoint to $\psi$.
The conjugate complex of equation (240) satisfied by $\psi^*$ is

$$(K^* + p_t^*)\psi^* = 0, \tag{242}$$

where

$$\psi^* = \begin{pmatrix}\psi_1^*\\ \psi_2^*\end{pmatrix}$$

is the conjugate complex of $\psi$. We shall not, however, in future need
this matrix, but the transposed matrix $\psi^\dagger = \{\psi_1^*, \psi_2^*\}$. If the matrix
elements of $K$ and $p_t$ were ordinary numbers (and not operators), we
could, instead of the preceding equation, write

$$\psi^\dagger(K^\dagger + p_t^\dagger) = 0. \tag{243}$$
We shall preserve this equation in the general case, with the convention
that the operators $K^\dagger$ and $p_t^\dagger$ (contrary to the rule assumed hitherto)
act not on their right but on their left. The same refers to matrix
operators of any type. Thus, if

$$F = \begin{pmatrix}F_{11} & F_{12}\\ F_{21} & F_{22}\end{pmatrix}$$

is a matrix operator acting on $\psi$ and $F\psi$ the one-column matrix

$$F\psi = \begin{pmatrix}F_{11}\psi_1 + F_{12}\psi_2\\ F_{21}\psi_1 + F_{22}\psi_2\end{pmatrix}$$

resulting therefrom, then the adjoint matrix $(F\psi)^\dagger$ will be defined by

$$\psi^\dagger F^\dagger = \{\psi_1^*F_{11}^\dagger + \psi_2^*F_{21}^\dagger,\ \psi_1^*F_{12}^\dagger + \psi_2^*F_{22}^\dagger\}
= \{F_{11}^*\psi_1^* + F_{12}^*\psi_2^*,\ F_{21}^*\psi_1^* + F_{22}^*\psi_2^*\},$$

which is in accordance with the usual definition of adjoint matrices.
The necessity for reversing the direction of the action of an operator
from right to left in a transition from $F$ to $F^\dagger$ is due to the fact that
$\psi^\dagger$, being a one-row matrix, must always stand as the first factor in
a matrix product involving it (while $\psi$, being defined as a one-column
matrix, must always stand in the second place).

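With this convention, the equation for the matrix-function $\psi_{K'}^\dagger$ can
be written in the form

$$\psi_{K'}^\dagger(K^\dagger - K'^\dagger) = 0$$

or, since $K'^\dagger = K'$,

$$\psi_{K'}^\dagger(K^\dagger - K') = 0. \tag{243 a}$$

This is equivalent to the ordinary equations

$$\psi_{K'1}^*(K_{11}^\dagger - K') + \psi_{K'2}^*K_{21}^\dagger = (K_{11}^* - K')\psi_{K'1}^* + K_{12}^*\psi_{K'2}^* = 0,$$

$$\psi_{K'1}^*K_{12}^\dagger + \psi_{K'2}^*(K_{22}^\dagger - K') = K_{21}^*\psi_{K'1}^* + (K_{22}^* - K')\psi_{K'2}^* = 0,$$

which are the conjugate complex of the equations (241 a) ($K'$ being real).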


The product of the matrices $\psi^\dagger$ and $\psi$ is a matrix consisting of one
row and one column only; it can be treated accordingly as a simple
number. This number

$$\psi^\dagger\psi = \psi_1^*\psi_1 + \psi_2^*\psi_2 \tag{244}$$

can also be regarded as the scalar product of the two-component vectors
$\psi$ and $\psi^*$ (or $\psi^\dagger$). It measures, as we know, the probability-density for
finding the electron at a given point in a state of motion specified by
the matrix or vector $\psi$. If the latter is 'quadratically integrable', i.e. if
the integral $\int\psi^\dagger\psi\,dV$ extended over the whole space converges, then
$\psi$ can be normalized by setting this integral equal to 1. This refers, in
particular, to functions $\psi_{K'}$ belonging to a discrete energy spectrum,
in which case we can put

$$\int\psi_{K'}^\dagger\psi_{K'}\,dV = 1. \tag{244 a}$$


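It can in addition easily be shown in practically the same way as in
the old theory that functions $\psi$ belonging to different energy values,
$K'$ and $K''$ say, satisfy the orthogonality relation

$$\int\psi_{K''}^\dagger\psi_{K'}\,dV = 0 \quad (K' \neq K''), \tag{244 b}$$

where

$$\psi_{K''}^\dagger\psi_{K'} = \psi_{K''1}^*\psi_{K'1} + \psi_{K''2}^*\psi_{K'2}$$

is the product of the matrices (or vectors) $\psi_{K''}^\dagger$ and $\psi_{K'}$.

We have in fact, multiplying the equation $(K - K')\psi_{K'} = 0$ (on the
left) by $\psi_{K''}^\dagger$ and the equation $\psi_{K''}^\dagger(K^\dagger - K'') = 0$ (on the right) by $\psi_{K'}$
and subtracting one from the other,

$$\psi_{K''}^\dagger(K\psi_{K'}) - (\psi_{K''}^\dagger K^\dagger)\psi_{K'} = (K' - K'')\psi_{K''}^\dagger\psi_{K'}. \tag{244 c}$$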

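The two sides of this equation can be considered as ordinary numbers.
If $K$ were not a differential operator but an ordinary matrix of
Hermitian character, i.e. satisfying the condition $K_{\alpha\beta} = K_{\alpha\beta}^\dagger = K_{\beta\alpha}^*$, or
$K = K^\dagger$, then the left side of (244 c) would vanish identically. In
reality, the matrix $K$, as defined by formula (240 b), has two component
parts of the above type, namely, the potential energy $U\delta$ and the
additional magnetic energy $-\boldsymbol{\mu}\cdot\mathbf{H} = -\mu\boldsymbol{\sigma}\cdot\mathbf{H}$. In fact, it can be directly
seen from the expressions (239 b) for the rectangular components of
Pauli's 'spin matrix' $\boldsymbol{\sigma}$ that

$$\boldsymbol{\sigma}^\dagger = \boldsymbol{\sigma}. \tag{245}$$

The left side of equation (244 c) thus reduces to

$$\frac{1}{2m_0}\left[\psi_{K''}^\dagger(u^2\psi_{K'}) - (\psi_{K''}^\dagger u^{\dagger 2})\psi_{K'}\right]
= \frac{1}{2m_0}\sum_{\alpha=1}^{2}\left[\psi_{K''\alpha}^*(u^2\psi_{K'\alpha}) - (u^{*2}\psi_{K''\alpha}^*)\psi_{K'\alpha}\right]$$

$$= \operatorname{div}\frac{h}{4\pi im_0}\sum_{\alpha=1}^{2}\left(\psi_{K''\alpha}^*\mathbf{u}\psi_{K'\alpha} + \psi_{K'\alpha}\mathbf{u}^*\psi_{K''\alpha}^*\right).$$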


It should be mentioned that in the case $\psi_{K'} = \psi_{K''}$ we obtain under
the div-sign an approximate expression for the current density $\mathbf{j}$. [Cf.
the derivation of the expressions (201 b) in § 25.]

Multiplying equation (244 c) by the volume-element $dV$ and integrating
over all space, we thus get

$$(K' - K'')\int\psi_{K''}^\dagger\psi_{K'}\,dV = 0,$$

whence the orthogonality relation (244 b) follows, unless $K' = K''$. The
case of degeneracy, i.e. $\psi_{K'} \neq \psi_{K''}$ when $K' = K''$, can be dealt with in the
new theory in exactly the same way as in the old theory, the Schrödinger
'scalar' function $\psi$ being replaced by the Pauli two-component vector
(or matrix) $\psi$.

The present theory in the above form is a combination of the ordinary
operator theory and the matrix theory, as developed in the preceding
chapters on the basis of Schrödinger's equation. It can be reduced,
however, to the usual matrix form by introducing the matrix-components
of the various (two-dimensional) operators $F$ by means of the
formula

$$F_{K'K''} = \int\psi_{K'}^\dagger F\psi_{K''}\,dV, \tag{246}$$

where

$$\psi_{K'}^\dagger F\psi_{K''} = \sum_{\alpha=1}^{2}\sum_{\beta=1}^{2}\psi_{K'\alpha}^*F_{\alpha\beta}\psi_{K''\beta} \tag{246 a}$$

is an ordinary number (the 'scalar product' of the two-dimensional
vectors $\psi_{K'}^\dagger$ and $F\psi_{K''}$; the latter can be regarded as the product of
the vector $\psi_{K''}$ and the two-dimensional 'tensor' $F$).
Replacing the functions $\psi_{K'}$ by their 'amplitudes' $\psi_{K'}^0$, with which
they are connected by the same relation

$$\psi_{K'\alpha} = \psi_{K'\alpha}^0(x, y, z)\,e^{-i2\pi K't/h}$$

as in the Schrödinger theory, we obtain the matrix-elements of $F$

$$F_{K'K''}^0 = \int\psi_{K'}^{0\dagger}F\psi_{K''}^0\,dV.$$

They are connected with the matrix-components by the usual relations

$$F_{K'K''} = F_{K'K''}^0\,e^{i2\pi(K' - K'')t/h}. \tag{246 b}$$
All the theorems which have been established in Chap. III with regard
to the matrix representation of physical quantities 'from the point of
view' of the energy $K$, remain valid if the latter, as well as the operators
representing other physical quantities, are defined as two-dimensional
tensors (or square matrices of the second rank). We have, for instance,
the usual expansion formula

$$F\psi_{K''} = \sum_{K'}F_{K'K''}\psi_{K'}, \tag{247}$$

which is a direct consequence of the orthogonality and normalizing
relations for the vector-functions $\psi_{K'}$ and which is equivalent to the
following two component-equations:

$$\sum_{\beta=1}^{2}F_{\alpha\beta}\psi_{K''\beta} = \sum_{K'}F_{K'K''}\psi_{K'\alpha} \quad (\alpha = 1, 2), \tag{247 a}$$

that is,

$$F_{11}\psi_{K''1} + F_{12}\psi_{K''2} = \sum_{K'}F_{K'K''}\psi_{K'1},$$

$$F_{21}\psi_{K''1} + F_{22}\psi_{K''2} = \sum_{K'}F_{K'K''}\psi_{K'2}.$$


The transformation theory, i.e. the transformation of the matrices of
various physical quantities from the point of view of $K$ (original energy
matrix) to the point of view of some other quantity $L$, as developed
in Chap. IV on the basis of Schrödinger's 'one-dimensional' theory, can
be applied without any formal modification to Pauli's two-dimensional
theory. Introducing the transformation coefficients $a_{K'L'}$, we have, for
example, the usual equation

$$\psi_{L'} = \sum_{K'}a_{K'L'}\psi_{K'}, \tag{248}$$

which is equivalent to the two equations

$$\psi_{L'\alpha} = \sum_{K'}a_{K'L'}\psi_{K'\alpha} \quad (\alpha = 1, 2). \tag{248 a}$$

To make the result expressed by these transformation equations unambiguous,
we must affix to the functions $\psi$ the index $x$ (short for $x$, $y$, $z$,
i.e. the rectangular coordinates of the point to which these functions
refer). We thus get

$$\psi_{L';\,\alpha x} = \sum_{K'}a_{K'L'}\psi_{K';\,\alpha x}. \tag{248 b}$$


This equation clearly shows that the index $\alpha$ (which is supposed to
assume the two values 1 and 2) plays exactly the same role as the space
coordinates $x$, $y$, $z$. It can be considered accordingly as an additional
'fourth' coordinate, which is usually referred to as the 'spin coordinate'.
With this condition, the two functions $\psi_1(x, y, z)$ and $\psi_2(x, y, z)$, forming
the components of the Pauli vector (or matrix) $\psi$, can be considered
as the two values of the same function $\psi(\alpha, x, y, z)$ referring to the same
values of $x$, $y$, $z$ and to the two different values $\alpha = 1$ and $\alpha = 2$ of the
spin coordinate. The addition of the latter to the usual three coordinates
$x$, $y$, $z$ enables one to reduce the two-dimensional Pauli theory
to the old uni-dimensional form, with one modification only concerning
the operators $F$ as defined 'from the point of view' of the basic
quantities $\alpha$, $x$, $y$, $z$. These operators can be defined as ordinary functions
of the continuously variable quantities $x$, $y$, $z$ and of the elementary
differential operators $p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$, $p_y = \frac{h}{2\pi i}\frac{\partial}{\partial y}$, $p_z = \frac{h}{2\pi i}\frac{\partial}{\partial z}$; they must,
however, be defined as matrices with regard to the discrete variable $\alpha$.
In fact, the result of the application of an operator $F$ to a function
of the type $\psi(\alpha, x, y, z)$ must be another function of the same type
$\phi(\beta, x, y, z)$, referring to the same values of $x$, $y$, $z$ but not necessarily to
the same value of $\alpha$. Assuming $\beta$ to be independent of $\alpha$, we see that the
most general type of linear operator satisfying the condition

$$F\psi(\alpha, x) = \phi(\beta, x)$$

can be defined by putting

$$\phi(\beta, x) = \sum_{\alpha=1}^{2}F_{\beta\alpha}\psi(\alpha, x),$$

where the $F_{\beta\alpha}$ are ordinary operators involving the space coordinates
only.

It is possible and sometimes convenient to modify the preceding
notation in the opposite way, namely, by preserving $\alpha$ as a duplicity
index and introducing similar indices for the two values of all the other
quantities which are derived from a single value through the action of
the spin term $-\mu\boldsymbol{\sigma}\cdot\mathbf{H}$ in the energy operator $K$. This refers in the first
place to the characteristic values of the energy itself. The two values
of $K'$, which are obtained by the splitting up of a certain characteristic
value of the Schrödinger energy operator $H'$ and which, in general, lie
very close to each other, could be denoted by adding to one of them
a subsidiary index, $\kappa$ say, assuming the two values 1 and 2, the
combination $(1, K')$ being equivalent to $K'_+$, say, and $(2, K')$ being equivalent
to $K'_-$, where $K'_\pm$ are the two values of $K'$ corresponding to the given
value of $H'$. With this notation, the transformation equation (248 b) can
be rewritten in the form

$$\psi_{\lambda L';\,\alpha x} = \sum_{\kappa=1}^{2}\sum_{K'}a_{\kappa K';\,\lambda L'}\,\psi_{\kappa K';\,\alpha x},$$

where $K'$ and $L'$ are the characteristic values of the energy operators $K$ or $L$
unperturbed by the spin term $-\boldsymbol{\mu}\cdot\mathbf{H}$.
From this point of view, the matrix components of an operator $F$:

$$F_{\kappa K';\,\lambda K''} = \int\psi_{\kappa K'}^\dagger F\psi_{\lambda K''}\,dV, \tag{249}$$

can be grouped together into two-dimensional matrices

$$F_{K'K''} = \begin{pmatrix}F_{1K';\,1K''} & F_{1K';\,2K''}\\ F_{2K';\,1K''} & F_{2K';\,2K''}\end{pmatrix}, \tag{249 a}$$

which correspond to the ordinary components of the matrix $F$, defined
from the point of view of the Schrödinger energy operator $H$ without
the spin term.

The matrix $F$, considered in this way, i.e. as formed by elements
which are themselves matrices, is called a 'super-matrix'.

We shall not consider the further development of these formal
considerations. The preceding outline will be sufficient for handling various
problems connected with Pauli's theory in any one of the three equivalent
forms, which have just been indicated. The simplest and most
important of these problems is the approximate solution of Pauli's
equation, considering the spin term $-\mu\boldsymbol{\sigma}\cdot\mathbf{H}$ as a small perturbation.
The energy operator resulting from $K$ by the omission of this term will
be denoted by $H$; it is equal to the Schrödinger operator $u^2/(2m_0) + U$
multiplied by the two-dimensional unit matrix $\delta$. In order to avoid
confusion between this operator and the magnetic field strength, we
shall denote the latter by $\mathfrak{H}$.

The change of the energy values $H'$ produced by this perturbation
can be calculated, to the first approximation, by means of the same
equations as in the case of the Schrödinger perturbation theory. In
doing this we must, however, keep in mind the fact that the unperturbed
problem is degenerate, each value of $H'$ corresponding to at least
two different states. It is just this latent duplicity which must be
revealed by taking into account the spin energy

$$S = -\mu\,\mathfrak{H}\cdot\boldsymbol{\sigma}. \tag{250}$$

Assuming no other degeneracy to take place (or the matrix elements
of the perturbation energy $S$ with regard to other states of equal
unperturbed energy to vanish), we obtain the following equation for
the first-order correction $\Delta H'$ of the unperturbed energy

$$\begin{vmatrix}S^{11} - \Delta H' & S^{12}\\ S^{21} & S^{22} - \Delta H'\end{vmatrix} = 0, \tag{250 a}$$

where

$$S^{\kappa\lambda} = S_{\kappa H';\,\lambda H'} = \int\psi_\kappa^\dagger S\psi_\lambda\,dV, \tag{250 b}$$

the indices $\kappa$, $\lambda$ (= 1, 2) specifying the two degenerate states in question.
They are used as superscripts in the matrix elements of $S$ in
order to distinguish the latter from the matrix elements with regard
to the spin-index

$$S_{\alpha\beta} = -\mu\,\mathfrak{H}\cdot\boldsymbol{\sigma}_{\alpha\beta} = -\mu(\mathfrak{H}_x\sigma_{x\alpha\beta} + \mathfrak{H}_y\sigma_{y\alpha\beta} + \mathfrak{H}_z\sigma_{z\alpha\beta}).$$

The two functions $\psi_\kappa$ ($\kappa = 1, 2$), or rather function-pairs $\psi_{\kappa H';\,\alpha}$
($\alpha = 1, 2$) describing these degenerate states must be defined with the
help of the ordinary Schrödinger function $\psi_{H'} = \psi$ in such a way as to

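satisfy the orthogonality and normalizing relations. The simplest way
to do this is to put

$$\begin{aligned}
\psi_{1H';\,1} &= \psi_{H'}, & \psi_{1H';\,2} &= 0\\
\psi_{2H';\,1} &= 0, & \psi_{2H';\,2} &= \psi_{H'}
\end{aligned} \tag{251}$$

(supposing the function $\psi_{H'}$ to be normalized).
By the definition of the spin matrix $\boldsymbol{\sigma}$ [cf. equations (239 b)] we have,
dropping the indices $H'$ and $\lambda$,

$$(S\psi)_1 = S_{11}\psi_1 + S_{12}\psi_2 = \mu[\mathfrak{H}_z\psi_1 - (\mathfrak{H}_x + i\mathfrak{H}_y)\psi_2],$$

$$(S\psi)_2 = S_{21}\psi_1 + S_{22}\psi_2 = \mu[(-\mathfrak{H}_x + i\mathfrak{H}_y)\psi_1 - \mathfrak{H}_z\psi_2].$$

In the present case these expressions reduce to

$$(S\psi_1)_1 = \mu\mathfrak{H}_z\psi, \quad (S\psi_1)_2 = \mu(-\mathfrak{H}_x + i\mathfrak{H}_y)\psi$$

for $\lambda = 1$, and

$$(S\psi_2)_1 = -\mu(\mathfrak{H}_x + i\mathfrak{H}_y)\psi, \quad (S\psi_2)_2 = -\mu\mathfrak{H}_z\psi$$

for $\lambda = 2$. We thus get, with the help of (250 b) or

$$S^{\kappa\lambda} = \int\sum_\alpha\sum_\beta\psi_{\kappa\alpha}^*S_{\alpha\beta}\psi_{\lambda\beta}\,dV:$$

$$\begin{aligned}
S^{11} &= \mu\int\mathfrak{H}_z\psi^*\psi\,dV\\
S^{12} &= -\mu\int(\mathfrak{H}_x + i\mathfrak{H}_y)\psi^*\psi\,dV\\
S^{21} &= -\mu\int(\mathfrak{H}_x - i\mathfrak{H}_y)\psi^*\psi\,dV\\
S^{22} &= -\mu\int\mathfrak{H}_z\psi^*\psi\,dV
\end{aligned} \tag{251 a}$$

whence, according to (250 a),

$$(\Delta H')^2 = (S^{11})^2 + |S^{12}|^2,$$

since $S^{22} = -S^{11}$ and $S^{21} = S^{12*}$, or

$$\Delta H' = \pm\sqrt{(S^{11})^2 + |S^{12}|^2}. \tag{251 b}$$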


This formula solves our problem so far as the splitting of the original 
‘unperturbed’ energy-level is concerned. The fact that the two sub- 
levels have an additional energy of the same magnitude and of opposite 
sign can be interpreted by assuming that the intrinsic magnetic moment 
of the electron has in both cases opposite orientations varying, in 
general, from one place to another according to the direction of the 
magnetic field. In the simplest case of a homogeneous field, the two 
orientations can be shown to be parallel to the latter. 


We have, in fact, in this case

$$S^{11} = \mu\mathfrak{H}_z\int\psi^*\psi\,dV = \mu\mathfrak{H}_z, \qquad S^{12} = -\mu(\mathfrak{H}_x + i\mathfrak{H}_y),$$

so that

$$\Delta H' = \pm\mu\mathfrak{H}, \tag{251 c}$$

where $\mathfrak{H} = \sqrt{\mathfrak{H}_x^2 + \mathfrak{H}_y^2 + \mathfrak{H}_z^2}$ is the magnitude of the magnetic field
strength. This formula is in full agreement with the assumption that
the electron has an intrinsic magnetic moment of magnitude $\mu$ (Bohr's
magneton), which in a homogeneous magnetic field is oriented either
in the same or in the opposite direction to the magnetic lines of force.
It can in addition easily be shown that, in the case under consideration,
formula (251 c), which has been derived as the first approximation,
holds exactly.
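
The step from the secular equation to the splitting $\pm\mu\mathfrak{H}$ can be followed numerically for a homogeneous field and a normalized function $\psi$. This is a minimal sketch only: the function names, the approximate CGS value of the Bohr magneton and the field components are assumptions chosen for the illustration.

```python
import math


def spin_matrix_elements(mu, hx, hy, hz):
    """S^11 and S^12 for a homogeneous field and normalized psi."""
    s11 = mu * hz
    s12 = -mu * complex(hx, hy)
    return s11, s12


def level_shifts(mu, hx, hy, hz):
    """Roots of the 2x2 secular equation, using S^22 = -S^11, S^21 = S^12*."""
    s11, s12 = spin_matrix_elements(mu, hx, hy, hz)
    shift = math.sqrt(s11**2 + abs(s12) ** 2)
    return shift, -shift


MU_BOHR = 9.274e-21             # erg/gauss, approximate value (assumed)
field = (3.0e3, -4.0e3, 1.2e4)  # gauss, arbitrary illustrative components
upper, lower = level_shifts(MU_BOHR, *field)
expected = MU_BOHR * math.sqrt(sum(h * h for h in field))
```

The two roots are equal and opposite and their magnitude coincides with $\mu$ times the magnitude of the field, whatever the direction of the latter with respect to the coordinate axes.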
For the sake of simplicity, we shall imagine the magnetic field to be
parallel to the $z$-axis. Pauli's equation then reduces to the form

$$(H - \mu\mathfrak{H}\sigma_z - K')\psi = 0,$$

which is equivalent to the two equations (cf. (235)):

$$(H + \mu\mathfrak{H} - K')\psi_1 = 0,$$

$$(H - \mu\mathfrak{H} - K')\psi_2 = 0.$$

If $\psi_{H'}$ is the solution of the Schrödinger equation $(H - H')\psi_{H'} = 0$
corresponding to the unsplit energy-level $H'$, then the solution of the
preceding system can be put in the form

(1) $K' = H' + \mu\mathfrak{H}$, $\psi_1 = \psi_{H'}$, $\psi_2 = 0$,

(2) $K' = H' - \mu\mathfrak{H}$, $\psi_1 = 0$, $\psi_2 = \psi_{H'}$.

The first case obviously corresponds to an orientation in the direction
opposite to that of the magnetic field, and the second to an orientation
in a direction coinciding with it (i.e. in the direction of the positive
$z$-axis).

This indicates, incidentally, that the functions $\psi_1$ and $\psi_2$ can be
considered as the probability amplitudes for finding the electron at a
given point with its intrinsic magnetic moment pointing in the negative
and positive directions of the $z$-axis respectively. In the general case,
both of them are different from zero. It is perfectly natural that, under
this condition, the probability of finding the electron at a given point
irrespective of its orientation should be measured by the sum $|\psi_1|^2 + |\psi_2|^2$.
We see, further, that the index $\alpha$ which distinguishes the two
components of the 'vector' $\psi$ fully deserves the title of a fourth
'spin-coordinate'; it must be borne in mind, however, that it specifies not
the orientation of the 'spin' or magnetic axis in space, but only its
orientation in one of the two senses parallel to a given direction, namely,
that of the $z$-axis.

This interpretation is supported by the form of the expression for
the average or probable value of the $z$-component of the electron's
magnetic moment, as defined in the usual way by the formula

$$\bar{\mu}_z = \int\psi^\dagger\mu_z\psi\,dV.$$

We have, namely, with $\mu_z = \mu\sigma_z$ and

$$(\sigma_z\psi)_1 = \sigma_{z11}\psi_1 + \sigma_{z12}\psi_2 = -\psi_1, \quad
(\sigma_z\psi)_2 = \sigma_{z21}\psi_1 + \sigma_{z22}\psi_2 = +\psi_2,$$

$$\bar{\mu}_z = \mu\int(\psi_2^*\psi_2 - \psi_1^*\psi_1)\,dV. \tag{252}$$

In a similar way we find

$$\begin{aligned}
\bar{\mu}_x &= \mu\int(\psi_1^*\psi_2 + \psi_2^*\psi_1)\,dV\\
\bar{\mu}_y &= i\mu\int(\psi_1^*\psi_2 - \psi_2^*\psi_1)\,dV
\end{aligned} \tag{252 a}$$

We thus see that the direct relation of the functions $\psi_1$ and $\psi_2$ to the
orientation of the electron's magnetic moment is limited to the $z$-axis.
The two functions $\psi_1^*\psi_2$ and $\psi_2^*\psi_1$ have complex conjugate values, and
cannot be associated with a definite direction of the electron's moment
parallel to the $x$- or to the $y$-axis.
The quantities

$$M_x = \mu(\psi_1^*\psi_2 + \psi_2^*\psi_1), \quad M_y = i\mu(\psi_1^*\psi_2 - \psi_2^*\psi_1), \quad M_z = \mu(\psi_2^*\psi_2 - \psi_1^*\psi_1) \tag{252 b}$$

are the components of a certain vector $\mathbf{M}$, which can be defined as the
probable magnetization, i.e. the probable value per unit volume of the
magnetic moment of the 'electron cloud' distributed with the density
$\psi_1^*\psi_1 + \psi_2^*\psi_2 = \rho$. The vector $\mathbf{M}/\rho$ can be regarded accordingly as
defining, both with respect to magnitude and direction, the probable value
of the intrinsic magnetic moment of the electron, supposed to be
situated at a given point. The magnitude of $\mathbf{M}/\rho$ must, of course, be
expected to be equal to $\mu$. This is easily seen to be actually the case.
We have in fact,

$$M^2 = M_x^2 + M_y^2 + M_z^2 = (M_x + iM_y)(M_x - iM_y) + M_z^2$$

$$= \mu^2[4\psi_1^*\psi_1\psi_2^*\psi_2 + (\psi_2^*\psi_2)^2 + (\psi_1^*\psi_1)^2 - 2\psi_2^*\psi_2\psi_1^*\psi_1] = \mu^2(\psi_1^*\psi_1 + \psi_2^*\psi_2)^2,$$

so that $M/\rho = \mu$. The unit vector $\mathbf{M}/\mu\rho$ thus determines the probable
direction of the electron's moment at a given point.
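
The identity $M = \mu\rho$ holds for any pair of complex values $\psi_1$, $\psi_2$ at a point, which is easy to confirm numerically. The sketch below assumes the component formulae $M_x = \mu(\psi_1^*\psi_2 + \psi_2^*\psi_1)$, $M_y = i\mu(\psi_1^*\psi_2 - \psi_2^*\psi_1)$, $M_z = \mu(\psi_2^*\psi_2 - \psi_1^*\psi_1)$; the function names and the sample values are assumptions of the illustration.

```python
def magnetization(psi1, psi2, mu=1.0):
    """Components of the probable magnetization at one point."""
    a = psi1.conjugate() * psi2   # psi_1^* psi_2
    b = psi2.conjugate() * psi1   # psi_2^* psi_1, the conjugate of a
    mx = mu * (a + b)             # real, since a + b = 2 Re(a)
    my = 1j * mu * (a - b)        # real, since a - b = 2i Im(a)
    mz = mu * (abs(psi2) ** 2 - abs(psi1) ** 2)
    return mx.real, my.real, mz


def density(psi1, psi2):
    """rho = psi_1^* psi_1 + psi_2^* psi_2."""
    return abs(psi1) ** 2 + abs(psi2) ** 2


psi1, psi2 = complex(0.6, -0.3), complex(-0.2, 0.7)   # arbitrary sample values
mx, my, mz = magnetization(psi1, psi2)
m_abs = (mx**2 + my**2 + mz**2) ** 0.5
rho = density(psi1, psi2)
```

With `mu = 1` the magnitude `m_abs` coincides with `rho`, so that the vector with components `mx / rho`, `my / rho`, `mz / rho` is a unit vector giving the probable direction of the moment.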
The physical meaning of the vector $\mathbf{M}$ is in agreement with the
expression $c\,\operatorname{curl}\mathbf{M}$ in formula (234 a) for the additional current density
(cf., for instance, my Lehrbuch der Elektrodynamik, vol. ii, Chap. I).

In contradistinction to the electron's position, its orientation cannot
be specified exactly, so that we must confine ourselves to the determination
of the probable orientation or of the probability of a certain
orientation (under given circumstances). The formal reason for this
difference is that the matrices $\mu_x$, $\mu_y$, $\mu_z$ or $\sigma_x$, $\sigma_y$, $\sigma_z$, whose
characteristic values should specify the orientation in the same way as the
values of the coordinates $x$, $y$, $z$ specify the position, are not independent
of each other.

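In fact, multiplying them according to the usual rule of matrix
multiplication, we get
$$\sigma_x\sigma_y = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = i\sigma_z, \qquad \sigma_y\sigma_z = i\sigma_x, \qquad \sigma_z\sigma_x = i\sigma_y. \tag{253}$$
If the multiplication is effected in the opposite order, the same results
are obtained but with the opposite sign, so that
$$\sigma_y\sigma_x = -\sigma_x\sigma_y, \qquad \sigma_z\sigma_y = -\sigma_y\sigma_z, \qquad \sigma_x\sigma_z = -\sigma_z\sigma_x. \tag{253 a}$$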


These equations express the fact that the matrices $\sigma_x, \sigma_y, \sigma_z$ do not
commute with each other, in contradistinction to the coordinates x, y, z;
according to Dirac’s terminology they are said to ‘anticommute’.
Combining equations (253) and (253 a), we get
$$\sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z,$$
etc., or in vector notation
$$\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma}. \tag{253 b}$$
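The relations (253), (253 a) and (253 b) are easily checked by direct multiplication. The following is a minimal sketch in Python, assuming the usual representation of the matrices with $\sigma_z$ diagonal; the helper names (`matmul`, `scale`, `commutator`, `equal`) are illustrative only and do not occur in the text.

```python
# Pauli matrices as nested tuples (standard representation, sigma_z diagonal).
SX = ((0, 1), (1, 0))
SY = ((0, -1j), (1j, 0))
SZ = ((1, 0), (0, -1))


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def scale(c, a):
    """Multiply every element of the matrix a by the number c."""
    return tuple(tuple(c * x for x in row) for row in a)


def commutator(a, b):
    """ab - ba for two 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(x - y for x, y in zip(r1, r2)) for r1, r2 in zip(ab, ba))


def equal(a, b):
    """Element-wise equality of two 2x2 matrices."""
    return all(a[i][j] == b[i][j] for i in range(2) for j in range(2))


# (253): sigma_x sigma_y = i sigma_z and its cyclic permutations
products_ok = (
    equal(matmul(SX, SY), scale(1j, SZ))
    and equal(matmul(SY, SZ), scale(1j, SX))
    and equal(matmul(SZ, SX), scale(1j, SY))
)
# (253 a): reversing the order of the factors reverses the sign
anticommute_ok = equal(matmul(SY, SX), scale(-1, matmul(SX, SY)))
# (253 b), z-component: sigma_x sigma_y - sigma_y sigma_x = 2i sigma_z
cross_ok = equal(commutator(SX, SY), scale(2j, SZ))
```

All three flags come out true; the same helpers can be reused to confirm that the square of each matrix is the unit matrix.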


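The non-commutability of the matrices $\sigma_x, \sigma_y, \sigma_z$ means that the
values of the quantities represented by them cannot be determined
(‘observed’ or ‘measured’) simultaneously. It should be mentioned
that these values are to be defined in the usual way, namely, as
the characteristic values of the corresponding matrices, regarded as
linear operators, acting on a two-component function of the type $\psi$.
Denoting these values by dashes, we have for their determination the
equations
$$\sigma_x\psi_x = \sigma_x'\psi_x, \qquad \sigma_y\psi_y = \sigma_y'\psi_y, \qquad \sigma_z\psi_z = \sigma_z'\psi_z,$$
or in components
$$\sigma_{x11}\psi_{x1} + \sigma_{x12}\psi_{x2} = \sigma_x'\psi_{x1}, \qquad \sigma_{x21}\psi_{x1} + \sigma_{x22}\psi_{x2} = \sigma_x'\psi_{x2},$$
etc., that is,
$$\begin{aligned} \psi_{x2} &= \sigma_x'\psi_{x1}, & \psi_{x1} &= \sigma_x'\psi_{x2}, \\ -i\psi_{y2} &= \sigma_y'\psi_{y1}, & i\psi_{y1} &= \sigma_y'\psi_{y2}, \\ \psi_{z1} &= \sigma_z'\psi_{z1}, & -\psi_{z2} &= \sigma_z'\psi_{z2}, \end{aligned} \tag{254}$$
whence it follows that
$$\begin{aligned} \sigma_x' &= \pm 1, & \psi_{x2} &= \pm\psi_{x1}, \\ \sigma_y' &= \pm 1, & \psi_{y2} &= \pm i\psi_{y1}, \\ \sigma_z' &= \pm 1, & \psi_{z2} &= 0 \ \text{(upper sign)} \ \text{or} \ \psi_{z1} = 0 \ \text{(lower sign)}. \end{aligned} \tag{254 a}$$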
The characteristic values of the rectangular components of the
electron’s magnetic moment $\boldsymbol{\mu} = \mu\boldsymbol{\sigma}$ are equal accordingly to $\pm\mu$. This
means that, in determining the orientation of this moment with respect
to some axis, we have to assume beforehand that it is parallel to this
axis, the question to be decided reducing to the choice between the
positive and the negative direction. In other words, we have to assume
that the electron’s magnetic moment is quantized about some (arbitrarily
chosen) axis, the two possible values of its projection on this axis being
$+\mu$ and $-\mu$, while its projection on any other axis remains undetermined.
In the preceding theory this role of quantization or reference
axis has been conferred on the z-axis. The theory can easily be
generalized for the case when this reference axis has any direction
whatsoever with regard to the coordinate axes.

These results appear quite natural from the point of view of the
general transformation theory, developed in Chapter IV. Since the
matrices $\sigma_x, \sigma_y, \sigma_z$ do not commute with each other, one of them only
can be used as a basic quantity, not only for the determination of the
two others, but also for the determination of the matrix $\sigma_n$ representing
the projection of $\boldsymbol{\sigma}$ on any other direction n. In the preceding theory,
this basic role has been conferred on $\sigma_z$, which appears accordingly as
a diagonal matrix, while $\sigma_x$ and $\sigma_y$ are not diagonal.

The present case can serve as a very simple illustration of the
transformation theory, since we have to do with two states only, the
state-space thus reducing to a plane in which the two states are represented
by two mutually perpendicular axes, $z_+$ and $z_-$ say. Replacing z as
a reference axis (in ordinary space) by some other axis z′, we obtain
two other states (in which the electron’s magnetic moment is oriented
parallel to z′), which are represented on the ‘state-plane’ by two other
mutually perpendicular axes $z'_+$ and $z'_-$ (with the same origin as the
axes $z_\pm$). If the angle between z and z′ is equal to $\theta$, then the angle
between the axes $z_+$ and $z'_+$ in the state-plane must obviously be equal
to $\tfrac{1}{2}\theta$, since to an angle of 180° between the direction of the positive
and negative z (or z′) axis there corresponds an angle of 90° between
the axes $z_+$ and $z_-$ (or $z'_+$ and $z'_-$) on the state diagram. Now, as we
know from the general theory, the square of the cosine of the angle
between two axes in the state-space is equal to the relative probability
of the state represented by one of them subject to the assumption that
the probability of the other is equal to unity. Hence it follows that
if the magnetic moment of the electron is known to be pointing in
a certain direction (that of +z, say), there is a probability equal to
$\cos^2\tfrac{1}{2}\theta$ that it will be found pointing in another direction (that of +z′)
making an angle $\theta$ with the former. The probability that it will be
found pointing in the direction opposite to the latter (i.e. that of −z′)
is equal to $\cos^2\tfrac{1}{2}(\pi - \theta) = \sin^2\tfrac{1}{2}\theta$. We thus see that if the electron’s
moment is known to point in a certain direction (+z), there is a
probability equal to $\cos^2\tfrac{1}{2}\theta + \sin^2\tfrac{1}{2}\theta = 1$ that it will be parallel to any
other direction (in the positive or the negative sense). This means, as
stated above, that the direction of the reference-axis to which the
electron’s moment must be assumed to be parallel can be chosen quite
arbitrarily.
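The half-angle rule can be illustrated numerically. The sketch below assumes, for simplicity, that the new axis z′ lies in the xz-plane, so that the two states can be taken with real components; the function names are hypothetical and chosen for this illustration only.

```python
import math


def up_along(theta):
    """State with the moment along +z', where z' makes the angle theta with z."""
    return (math.cos(theta / 2), math.sin(theta / 2))


def down_along(theta):
    """State with the moment along -z'."""
    return (-math.sin(theta / 2), math.cos(theta / 2))


def sigma_n(theta, state):
    """Apply sigma_n = cos(theta) sigma_z + sin(theta) sigma_x to a real state."""
    a, b = state
    return (
        math.cos(theta) * a + math.sin(theta) * b,
        math.sin(theta) * a - math.cos(theta) * b,
    )


def probability(a, b):
    """Squared cosine of the angle between two unit vectors of the state-plane."""
    return (a[0] * b[0] + a[1] * b[1]) ** 2


theta = math.radians(60)
up_z = up_along(0.0)                              # moment known to point along +z
p_plus = probability(up_z, up_along(theta))       # cos^2(theta/2)
p_minus = probability(up_z, down_along(theta))    # sin^2(theta/2)
```

For $\theta = 60°$ the two probabilities are $\cos^2 30° = 3/4$ and $\sin^2 30° = 1/4$, and their sum is unity for any $\theta$. Applying `sigma_n` to `up_along(theta)` returns the same state, which confirms that it belongs to the characteristic value +1 of $\sigma_n$.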

All these results can be considered as a particular case of those
holding for the magnetic moment, or the mechanical angular momentum,
due to the orbital motion of a (non-spinning) electron in a radially
symmetrical (central) field of force. As shown in Chapter II, the
z-component of this orbital angular momentum $M_z$ can be assumed to
be quantized, i.e. to take a discrete set of (characteristic) values $mh/2\pi$,
the axial quantum number m varying from $-l$ to $+l$, where l is the
angular quantum number determining the total angular momentum
according to the formula $M^2 = h^2 l(l+1)/4\pi^2$, while the x- and y-components
of M do not have definite values. The present case can be
obtained from the general case by taking l equal to $\tfrac{1}{2}$, i.e. by ascribing
to the electron, irrespective of its orbital motion, a spin motion of a
‘half-quantum’ magnitude. We have seen in Chapter III that the
matrix representation of physical quantities, being more general than
the operator representation, leaves room both for integral and
half-integral values of the angular quantum number, subject to the condition
that the axial quantum number should vary by elementary steps
$\Delta m = 1$ from $-l$ to $+l$. This vacant place, or rather the lowest vacant
step on the l-staircase, can now be filled by the electron’s spin angular
momentum. The other (higher) steps can be represented by combining
the latter with the orbital angular momentum, if any (see below).
The possibility of attributing to the electron, in addition to an
intrinsic magnetic moment $\boldsymbol{\mu}$, an intrinsic angular momentum $\mathbf{s}$
proportional to it, i.e. represented by the same matrix $\boldsymbol{\sigma}$ with a certain
numerical factor, follows also from the fact that this matrix satisfies
the commutation relation (253 b) which is quite similar to the
commutation relation $\mathbf{M}\times\mathbf{M} = -h\mathbf{M}/2\pi i$ satisfied by the orbital angular
momentum M. Assuming the electron to possess an intrinsic angular
momentum
$$\mathbf{s} = \kappa\boldsymbol{\sigma} \tag{255}$$
satisfying the preceding relation, we get
$$\kappa^2\,\boldsymbol{\sigma}\times\boldsymbol{\sigma} = -\frac{h}{2\pi i}\,\kappa\boldsymbol{\sigma},$$
or, according to (253 b),
$$\kappa = \frac{1}{2}\,\frac{h}{2\pi}, \tag{255 a}$$
which means that the magnitude of this momentum corresponds to
$l = \tfrac{1}{2}$, as was deduced above from the fact that the electron’s magnetic
moment can only assume two (opposite) orientations parallel to a
quantization axis.

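It should be noticed that the formula $\mathbf{s} = \kappa\boldsymbol{\sigma}$, with the above
half-quantum value of $\kappa$, does not contradict the result that the
characteristic value of the square of $\mathbf{s}$ must be equal not to $\tfrac{1}{4}h^2/4\pi^2$, but to
$\tfrac{3}{4}h^2/4\pi^2$, where $\tfrac{3}{4} = l(l+1)$ with $l = \tfrac{1}{2}$. In fact, squaring the equation
$\mathbf{s} = \kappa\boldsymbol{\sigma}$, we get
$$s^2 = \kappa^2\sigma^2 = \kappa^2(\sigma_x^2 + \sigma_y^2 + \sigma_z^2).$$
The characteristic values of $s^2$ are obtained by substituting the
characteristic values of $\sigma_x^2, \sigma_y^2, \sigma_z^2$. Now from the definition of the matrices
$\sigma_x, \sigma_y, \sigma_z$ it follows that their squares are equal to the unit matrix
$\delta = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$:
$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \delta. \tag{255 b}$$
The characteristic values of the latter being equal to 1, we thus get
$$\text{char. value of } s^2 = 3\kappa^2 = \frac{3}{4}\,\frac{h^2}{4\pi^2}.$$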


While the electron’s intrinsic angular momentum $\kappa$ has a half-quantum
value, its magnetic moment $\mu = he/4\pi m_0 c$ has a whole-quantum value,
i.e. the same value as the magnetic moment due to the orbital motion
with the angular quantum number l = 1. The ratio of the magnetic
moment to the angular momentum
$$\frac{\mu}{\kappa} = \frac{e}{m_0 c}$$
is thus twice as large in the case of spin as it is in the case of the orbital
motion.

This difference may be reduced formally to the fact that the spin
matrix satisfies the relations (253) and (253 a), which are responsible for
the factor 2 in (253 b) and consequently for the factor $\tfrac{1}{2}$ in (255 a) (these
relations have no parallel in the case of the matrices representing the
orbital angular momentum). It is the fundamental cause of the
complications in the action of a magnetic field on a spinning electron,
moving in a central field of force, which are usually referred to as the
‘anomalous’ Zeeman effect.

Postponing the detailed consideration of the latter till a later section,
we shall calculate here the rate of change of the total angular momentum
of the electron due to the couple produced by the magnetic field. If the
preceding assumptions about the electron’s spin are correct, then we
must have (so long as the electrostatic field can be supposed to produce
no couple), according to the classical mechanics,
$$\frac{d}{dt}(\mathbf{L} + \kappa\boldsymbol{\sigma}) = \frac{e}{2m_0 c}(\mathbf{L} + 2\kappa\boldsymbol{\sigma})\times\mathfrak{H}, \tag{256}$$
where L is that part of the angular momentum which is due to the
orbital motion and $\mathfrak{H}$ is the magnetic field. The same equation must hold
in wave mechanics if L and $\boldsymbol{\sigma}$ are considered as operators and if the time
derivative of an operator F is defined with the help of the energy
operator K by means of the formula
$$\frac{dF}{dt} = [K, F] = \frac{2\pi i}{h}(KF - FK). \tag{256 a}$$
In equation (256), the operator (or operator-matrix) $\mathbf{M} = \mathbf{L} + \kappa\boldsymbol{\sigma}$
represents the total angular momentum of the electron and the operator-matrix
$$\frac{e}{2m_0 c}\mathbf{L} + \mu\boldsymbol{\sigma} = \frac{e}{2m_0 c}(\mathbf{L} + 2\kappa\boldsymbol{\sigma})$$
the total magnetic moment, due both to its motion about the nucleus
and the supposed ‘spinning’ about its own axis.
Neglecting the terms proportional to the square of the magnetic
field, we can put
$$K = H - \mu\,\mathfrak{H}\cdot\boldsymbol{\sigma},$$
where H is the Schrödinger energy operator,
$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2 + U - \frac{e}{2m_0 c}\,\mathfrak{H}\cdot\mathbf{L}$$
[cf. (210a, b), § 26], supposed to be multiplied by the two-dimensional
unit matrix $\delta$ [$e\mathbf{L}/(2m_0 c)$ is the magnetic moment of the orbital motion].

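The sum of the first two terms of this operator, representing the
kinetic energy and the potential energy of the radially symmetrical
electric field, commutes both with L and $\boldsymbol{\sigma}$, so that in the formula
(256 a), with F = M, we can put simply
$$K = -\mathfrak{H}\cdot\left(\frac{e}{2m_0 c}\mathbf{L} + \mu\boldsymbol{\sigma}\right), \tag{256 b}$$
it being understood that L is multiplied by the unit matrix $\delta$.
Now we have, since $\boldsymbol{\sigma}$ obviously commutes with L,
$$[K, \mathbf{L}] = -\frac{e}{2m_0 c}\,[(\mathfrak{H}\cdot\mathbf{L}), \mathbf{L}],$$
$$[K, \kappa\boldsymbol{\sigma}] = -\kappa\mu\,[(\mathfrak{H}\cdot\boldsymbol{\sigma}), \boldsymbol{\sigma}].$$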
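For the sake of simplicity, we shall assume the magnetic field to be
parallel to the z-axis (this does not, of course, involve any loss of
generality). Taking the rectangular components of the bracket
expressions on the right side of the preceding equations, we get, with the
help of the equations $\mathbf{L}\times\mathbf{L} = -h\mathbf{L}/2\pi i$ and $\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma}$,
$$[\mathfrak{H}L_z, L_x] = \mathfrak{H}[L_z, L_x] = \frac{2\pi i}{h}\,\mathfrak{H}(\mathbf{L}\times\mathbf{L})_y = -\mathfrak{H}L_y = -(\mathbf{L}\times\mathfrak{H})_x,$$
$$[\mathfrak{H}L_z, L_y] = \mathfrak{H}[L_z, L_y] = -\frac{2\pi i}{h}\,\mathfrak{H}(\mathbf{L}\times\mathbf{L})_x = \mathfrak{H}L_x = -(\mathbf{L}\times\mathfrak{H})_y,$$
$$[\mathfrak{H}L_z, L_z] = 0,$$
$$[\mathfrak{H}\sigma_z, \sigma_x] = \frac{2\pi i}{h}\,\mathfrak{H}(\boldsymbol{\sigma}\times\boldsymbol{\sigma})_y = -\frac{4\pi}{h}\,\mathfrak{H}\sigma_y = -\frac{4\pi}{h}(\boldsymbol{\sigma}\times\mathfrak{H})_x,$$
$$[\mathfrak{H}\sigma_z, \sigma_y] = -\frac{2\pi i}{h}\,\mathfrak{H}(\boldsymbol{\sigma}\times\boldsymbol{\sigma})_x = \frac{4\pi}{h}\,\mathfrak{H}\sigma_x = -\frac{4\pi}{h}(\boldsymbol{\sigma}\times\mathfrak{H})_y,$$
$$[\mathfrak{H}\sigma_z, \sigma_z] = 0.$$
We thus have, returning to the vector notation,
$$[(\mathfrak{H}\cdot\mathbf{L}), \mathbf{L}] = -\mathbf{L}\times\mathfrak{H}, \qquad [(\mathfrak{H}\cdot\boldsymbol{\sigma}), \boldsymbol{\sigma}] = -\frac{4\pi}{h}\,\boldsymbol{\sigma}\times\mathfrak{H},$$
and consequently
$$[K, (\mathbf{L} + \kappa\boldsymbol{\sigma})] = \left(\frac{e}{2m_0 c}\mathbf{L} + \frac{4\pi}{h}\kappa\mu\boldsymbol{\sigma}\right)\times\mathfrak{H},$$
or, since $\kappa = h/4\pi$,
$$[K, (\mathbf{L} + \kappa\boldsymbol{\sigma})] = \left(\frac{e}{2m_0 c}\mathbf{L} + \mu\boldsymbol{\sigma}\right)\times\mathfrak{H},$$
which is nothing else but equation (256).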
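The spin part of this calculation can be verified by direct matrix multiplication. The following sketch assumes the bracket of (256 a), a field directed along the z-axis, and arbitrary units for h and for the field strength; the constants `H_PLANCK` and `FIELD` and the helper names are introduced here for illustration only.

```python
import math

H_PLANCK = 1.0  # Planck's constant h in arbitrary units (assumption of the sketch)
FIELD = 2.0     # strength of the field along z, arbitrary units

SX = ((0, 1), (1, 0))
SY = ((0, -1j), (1j, 0))
SZ = ((1, 0), (0, -1))


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def scale(c, a):
    """Multiply every element of the matrix a by the number c."""
    return tuple(tuple(c * x for x in row) for row in a)


def bracket(k, f):
    """[K, F] = (2*pi*i/h)(KF - FK), the definition (256 a)."""
    kf, fk = matmul(k, f), matmul(f, k)
    factor = 2j * math.pi / H_PLANCK
    return tuple(
        tuple(factor * (x - y) for x, y in zip(r1, r2)) for r1, r2 in zip(kf, fk)
    )


def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices within a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


field_dot_sigma = scale(FIELD, SZ)  # the operator (field . sigma) for a field along z
c = 4 * math.pi / H_PLANCK

# With the field along z: (sigma x field)_x = field * sigma_y,
# (sigma x field)_y = -field * sigma_x, (sigma x field)_z = 0.
x_ok = close(bracket(field_dot_sigma, SX), scale(-c * FIELD, SY))
y_ok = close(bracket(field_dot_sigma, SY), scale(c * FIELD, SX))
z_ok = close(bracket(field_dot_sigma, SZ), scale(0, SZ))
```

The three flags confirm, component by component, the relation $[(\mathfrak{H}\cdot\boldsymbol{\sigma}), \boldsymbol{\sigma}] = -(4\pi/h)\,\boldsymbol{\sigma}\times\mathfrak{H}$ used above.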


Our interpretation of the matrix $\boldsymbol{\sigma}$ as representing a spin motion of
the electron with an angular momentum $\kappa = h/4\pi$ and a magnetic
moment $\mu = eh/4\pi m_0 c$ is thus fully checked, at least from the formal
point of view. One may argue that it cannot have an actual physical
significance since the electron in the Pauli theory, just as in that of
Schrödinger, is dealt with as a point, with definite coordinates x, y, z,
and a point-like particle cannot be imagined to be spinning. To this
one can retort firstly, that Pauli’s theory amounts to the addition of
a fourth ‘spin’ coordinate, giving a schematical representation of the
spin motion; and secondly, that the translational motion (in particular
the revolution about a fixed centre) in wave mechanics is also
represented in a schematical way only.


30. More Exact Form of the Two-dimensional Matrix Theory; 
Electron’s Electric Moment 


Pauli’s theory, discussed in the preceding section, accounts for the 
duplicity phenomenon in the presence of a magnetic field only, whereas, 
in reality, this phenomenon is observed just as well without such a field. 
A full account of the experimental facts is given by the theory of Dirac 
which we are now going to examine on the same lines. The preceding 
analysis of Pauli’s theory will prove very helpful in the discussion of 
the mathematical form and physical meaning of Dirac’s exact theory. 
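If we put
$$\psi_3 = \chi_1, \qquad \psi_4 = \chi_2, \tag{257}$$
then equations (229a) and (229b) of Dirac’s theory can be written in
the following form:
$$\begin{aligned} \boldsymbol{\sigma}\cdot\mathbf{u}\,\psi + (u_t - m_0 c)\chi &= 0, \\ \boldsymbol{\sigma}\cdot\mathbf{u}\,\chi + (u_t + m_0 c)\psi &= 0, \end{aligned} \tag{257 a}$$
where $\boldsymbol{\sigma}$ is Pauli’s spin matrix, while the operators $u_t \pm m_0 c$ are
understood to be multiplied by the unit matrix $\delta = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$; $\psi$ denotes here
the two-dimensional matrix $\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$ and $\chi$ denotes the matrix $\begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix}$.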


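Applying to the first of equations (257 a) the operation $\boldsymbol{\sigma}\cdot\mathbf{u}$, we get,
with the help of the second equation,
$$(\boldsymbol{\sigma}\cdot\mathbf{u})^2\psi + [(\boldsymbol{\sigma}\cdot\mathbf{u})u_t - u_t(\boldsymbol{\sigma}\cdot\mathbf{u})]\chi + (u_t - m_0 c)\,\boldsymbol{\sigma}\cdot\mathbf{u}\,\chi$$
$$= (\boldsymbol{\sigma}\cdot\mathbf{u})^2\psi + [(\boldsymbol{\sigma}\cdot\mathbf{u})u_t - u_t(\boldsymbol{\sigma}\cdot\mathbf{u})]\chi - (u_t - m_0 c)(u_t + m_0 c)\psi = 0.$$
Now $(u_t - m_0 c)(u_t + m_0 c) = u_t^2 - m_0^2 c^2$; we have further, according to
(218 a),
$$(\boldsymbol{\sigma}\cdot\mathbf{u})u_t - u_t(\boldsymbol{\sigma}\cdot\mathbf{u}) = \frac{he}{2\pi c}\, i\,\boldsymbol{\sigma}\cdot\mathbf{E}$$
and
$$(\boldsymbol{\sigma}\cdot\mathbf{u})^2 = \sigma_x^2 u_x^2 + \ldots + \sigma_x\sigma_y u_x u_y + \ldots = (u_x^2 + u_y^2 + u_z^2) + i\sigma_z(u_x u_y - u_y u_x) + \ldots,$$
i.e., according to (218),
$$(\boldsymbol{\sigma}\cdot\mathbf{u})^2 = u^2 - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot\mathbf{H}.$$
Putting, for the sake of brevity, $u^2 - u_t^2 + m_0^2 c^2 = D$, we thus get
$$D\psi - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot(\mathbf{H}\psi - i\mathbf{E}\chi) = 0. \tag{258}$$
In a similar way we obtain the equation
$$D\chi - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot(\mathbf{H}\chi - i\mathbf{E}\psi) = 0. \tag{258 a}$$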


These equations are equivalent respectively to the second-order equa- 
tions (230) and (230a) of the Dirac theory and could, of course, be 
derived directly from the latter. 
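The expressions (232) and (232a) for the probability density and the
probability current-density can be written in the form
$$\rho = \psi^\dagger\psi + \chi^\dagger\chi, \tag{259}$$
$$\mathbf{j} = c(\psi^\dagger\boldsymbol{\sigma}\chi + \chi^\dagger\boldsymbol{\sigma}\psi). \tag{259 a}$$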
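In the case of a conservative motion with a positive energy $\epsilon$ which
differs relatively little from the rest energy $m_0 c^2$, the functions $\chi$ can
be expressed in terms of $\psi$ with the help of the relations (233 a) or
$$\chi = \frac{1}{2m_0 c}\,\boldsymbol{\sigma}\cdot\mathbf{u}\,\psi, \tag{260}$$
which is the approximate form of the first of equations (257 a).
Using the relation
$$\sigma_x(\boldsymbol{\sigma}\cdot\mathbf{u}) = \sigma_x\sigma_x u_x + \sigma_x\sigma_y u_y + \sigma_x\sigma_z u_z = u_x + i\sigma_z u_y - i\sigma_y u_z,$$
that is,
$$\boldsymbol{\sigma}(\boldsymbol{\sigma}\cdot\mathbf{u}) = \mathbf{u} + i\,\mathbf{u}\times\boldsymbol{\sigma}, \tag{260 a}$$
we get, substituting the expression (260) in (259 a),
$$\mathbf{j} = \frac{1}{2m_0}\,\psi^\dagger\mathbf{u}\psi + \frac{i}{2m_0}\,\psi^\dagger\,\mathbf{u}\times\boldsymbol{\sigma}\,\psi + \text{conjugate complex},$$
which is easily reduced to the approximate form (234 b) with
$\mathbf{M} = \mu\psi^\dagger\boldsymbol{\sigma}\psi$, in agreement with (252 b). As a matter of fact, we have
merely repeated the argument of § 28, using the new matrix notation
to illustrate its convenience.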

The equation of Pauli’s theory was obtained from (258) by neglecting
the last term (proportional to $\chi$) and replacing the two terms
$-u_t^2 + m_0^2 c^2$ in the relativistic operator D by $2m_0(p_t + U)$. We shall get a
better approximation if we substitute in (258) the expression (260) for
$\chi$ (which gives an additional term of the second order in 1/c) and
introduce a correction term of the same order in the expression for D.
Limiting ourselves, for the sake of simplicity, to the case of conservative
motion, and putting $\epsilon = m_0 c^2 + K$ and $\epsilon' = m_0 c^2 + K'$, we have
$$u_t = -\frac{1}{c}(\epsilon' - U) = -\frac{1}{c}(m_0 c^2 + K' - U).$$
This gives $D = u^2 - 2m_0(K' - U) - (K' - U)^2/c^2$, so that equation (258)
assumes the form
$$\left[u^2 - 2m_0(K' - U) - \frac{1}{c^2}(K' - U)^2\right]\psi - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot(\mathbf{H}\psi - i\mathbf{E}\chi) = 0.$$
Neglecting the relativistic corrections, i.e. putting $c = \infty$, we obtain
the ordinary Schrödinger equation
$$[u^2 - 2m_0(K' - U)]\psi = 0,$$
whence it follows that, with an accuracy of the order of $1/c^2$, we can
replace the operator $(K' - U)^2/c^2$ by $u^4/(2m_0 c)^2 = (u_x^2 + u_y^2 + u_z^2)^2/(2m_0 c)^2$.
The preceding equation thus reduces to the standard form
$$(K - K')\psi = 0,$$
with the energy operator
$$K = \frac{1}{2m_0}u^2 + U - \frac{1}{(2m_0)^3 c^2}u^4 - \frac{he}{4\pi m_0 c}\left[\mathbf{H}\cdot\boldsymbol{\sigma} - \frac{i}{2m_0 c}\,\mathbf{E}\cdot\boldsymbol{\sigma}(\boldsymbol{\sigma}\cdot\mathbf{u})\right].$$

With the help of the formula (260 a) the last term in this expression
can be rewritten, with $\mu = he/4\pi m_0 c$, in the form
$$\frac{i\mu}{2m_0 c}\,\mathbf{E}\cdot\boldsymbol{\sigma}(\boldsymbol{\sigma}\cdot\mathbf{u}) = \frac{1}{2m_0 c}\left[i\mu\,\mathbf{E}\cdot\mathbf{u} - \mu\,\mathbf{E}\cdot(\mathbf{u}\times\boldsymbol{\sigma})\right].$$
The operator $i\mu\,\mathbf{E}\cdot\mathbf{u}$ represents a purely imaginary quantity whose
average value vanishes† and which can therefore be left out of account.
Putting $\mu\boldsymbol{\sigma} = \boldsymbol{\mu}$, we thus get
$$K = \left(\frac{1}{2m_0}u^2 + U\right)\delta + S, \tag{261}$$
where the first term represents the usual (Schrödinger) energy operator
(multiplied by the two-dimensional unit-matrix $\delta$), while the operator
$$S = -\frac{1}{(2m_0)^3 c^2}u^4 - \mathbf{H}\cdot\boldsymbol{\mu} - \mathbf{E}\cdot\frac{1}{2m_0 c}\,\mathbf{u}\times\boldsymbol{\mu} \tag{261 a}$$
can be regarded as a kind of perturbation energy, which specifies, with
an accuracy of the second order in 1/c, the influence of the relativity
corrections. One of these, represented by the first term in S, refers to
the variability of mass with velocity, while the other, represented by the
second and third terms, corresponds to the spin phenomenon. The second
term, which has been discussed already in the preceding section, can
be regarded as the additional energy due to the electron’s intrinsic
magnetic moment $\boldsymbol{\mu}$. As to the third term, it can be interpreted in
a similar way, namely, as the additional energy due to the presence
of an electric moment represented by the operator
$$\frac{1}{2m_0 c}\,\mathbf{u}\times\boldsymbol{\mu}.$$

† In fact the product $\frac{e}{m_0}\mathbf{E}\cdot\mathbf{u}$ is approximately equal to the work done on the electron
per unit time, i.e. to $-dU/dt$; in the case of a stationary motion its average value
must obviously be equal to zero.

We are thus led to regard the electron as a particle combining the
properties of a point charge, of an elementary magnet, and of an
elementary electric dipole, with an electric moment proportional to the
magnetic moment ($\mu$) and to the velocity of translational motion,
represented approximately by the operator $\mathbf{u}/m_0$.

It should be mentioned that the association of an electric moment
with a moving particle which is known to possess, when at rest, a
magnetic moment, is a direct consequence of the relativity theory as
applied to the connexion between the magnetic and the electric field.

If we have, for example, in the coordinate system A only a magnetic
field H (E = 0), then in another system A′ which is moving relatively
to the first with a velocity v′ = −v, we must have, in addition to
a magnetic field H′ which is slightly different from H (the difference
being of the second order in v/c), an electric field
$$\mathbf{E}' = -\mathbf{v}\times\mathbf{H}'/c \cong -\mathbf{v}\times\mathbf{H}/c,$$
and vice versa: in the case of the presence of a pure electric field
E (H = 0) in the system A, there must be, in the system A′, besides
an electric field E′ somewhat different from E, also a magnetic field
$$\mathbf{H}' = \mathbf{v}\times\mathbf{E}'/c \cong \mathbf{v}\times\mathbf{E}/c.$$

Let us consider in the latter case a particle which is moving with the
system A′ and which, with regard to this system, possesses a magnetic
moment $\boldsymbol{\mu}$. It will have accordingly an additional magnetic energy
$U' = -\boldsymbol{\mu}\cdot\mathbf{H}' = -\boldsymbol{\mu}\cdot\mathbf{v}\times\mathbf{E}'/c = \boldsymbol{\mu}\cdot\mathbf{v}'\times\mathbf{E}'/c$. Now this energy can be
expressed in the form
$$U' = \frac{1}{c}\,\mathbf{E}'\cdot(\boldsymbol{\mu}\times\mathbf{v}')$$
or
$$U' \cong -\frac{1}{c}\,\mathbf{E}\cdot(\mathbf{v}'\times\boldsymbol{\mu})$$
and interpreted as the additional electric energy with regard to the
system A of an electric dipole with a moment
$$\frac{1}{c}\,\mathbf{v}'\times\boldsymbol{\mu}.$$

We are thus entitled to assume that a particle which, when at rest,
behaves like an elementary magnet with a moment $\boldsymbol{\mu}$ acquires, when
moving with a velocity v′, an electric moment $\mathbf{v}'\times\boldsymbol{\mu}/c$. This result
can be obtained directly with the help of the spinning sphere model of
the electron, if due account is taken of the redistribution of the electric
current density produced by the superposition of the translatory
motion on that of rotation.†

Replacing the velocity v′ by the operator $\mathbf{u}/m_0$, we obtain for the
representation of the electron’s electric moment the operator
$$\boldsymbol{\nu} = \frac{1}{m_0 c}\,\mathbf{u}\times\boldsymbol{\mu}, \tag{261 b}$$
which is just double the previous expression. The additional electric
energy, represented by the last term in (261 a), must be written
accordingly in the form
$$U_e = -\tfrac{1}{2}\,\boldsymbol{\nu}\cdot\mathbf{E}, \tag{261 c}$$
while the magnetic energy is expressed in the usual way by
$$U_m = -\mathbf{H}\cdot\boldsymbol{\mu}.$$

The origin of the factor $\tfrac{1}{2}$ in (261 c) can be interpreted in different
ways. It can be obtained, in the first place, by applying the relativity
theory to the spin motion.‡ It is simpler, however, to connect it with
the fact that the energy $U_e$ corresponds to a second-order effect (while
$U_m$ corresponds to a first-order effect), as in the familiar case of a
particle possessing no rigid electric dipole moment, and acquiring such
a moment under the influence of the electric field only. In the present
case, this influence is an indirect one, proceeding through the velocity
of translational motion which is maintained by the electric field.

Before discussing the exact theory of Dirac, we shall apply the
preceding corrected form of the Pauli theory to the approximate calculation
of the so-called ‘relativity corrections’, i.e. of the shift and splitting of
the energy-levels of an electron moving in a spherically symmetrical
electric field with or without a homogeneous magnetic field superposed
upon it.

† See my Lehrbuch der Elektrodynamik, vol. i, pp. 295-6.

‡ See L. H. Thomas, Nature (1926), p. 514, and Phil. Mag. (1927); also J. Frenkel,
Zeits. f. Phys. 37 (1926), 273.


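A. No magnetic field
The perturbation energy reduces in this case to
$$S = -\frac{1}{(2m_0)^3 c^2}\,p^4 - \frac{\mu}{2m_0 c}\,\mathbf{E}\cdot(\mathbf{p}\times\boldsymbol{\sigma}), \tag{262}$$
where $\mathbf{p} = h\nabla/2\pi i$ is the operator representing the electron’s momentum.
Putting
$$\mathbf{E} = \frac{Ze}{r^3}\,\mathbf{r},$$
which corresponds to a Coulomb field of force produced by a nucleus
with a charge Ze, we get
$$\mathbf{E}\cdot(\mathbf{p}\times\boldsymbol{\sigma}) = \boldsymbol{\sigma}\cdot(\mathbf{E}\times\mathbf{p}) = \frac{Ze}{r^3}\,\boldsymbol{\sigma}\cdot(\mathbf{r}\times\mathbf{p}) = \frac{Ze}{r^3}\,\boldsymbol{\sigma}\cdot\mathbf{L},$$
where $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is the operator of the electron’s angular momentum
(without the contribution $\kappa\boldsymbol{\sigma}$ due to the spin). Substituting this
expression in (262) and replacing $p^2/2m_0$ by $H' - U = H' + Ze^2/r$, where $H'$ is
the unperturbed energy, as given by Schrödinger’s or Bohr’s theory,
we get
$$S = -\frac{1}{2m_0 c^2}\left[\left(H' + \frac{Ze^2}{r}\right)^2 + \frac{\alpha}{r^3}\,(\mathbf{L}\cdot\boldsymbol{\sigma})\right], \tag{262 a}$$
where
$$\alpha = c\mu Ze = \frac{Ze^2 h}{4\pi m_0},$$
the charge of the electron being denoted by −e.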

The expression (262 a) is somewhat similar to the expression (150) for
the magnetic perturbation energy, differing from it in the first place
by the fact that the constant magnetic field is replaced by a kind
of effective magnetic field
$$\frac{Ze}{2m_0 c r^3}\,\mathbf{L},$$
which is inversely proportional to the cube of the distance from the
nucleus and parallel to the vector of the angular momentum L, and in
the second place by the appearance of the additional term
$$-\frac{1}{2m_0 c^2}\left(H' + \frac{Ze^2}{r}\right)^2, \tag{262 b}$$
which is supposed to be multiplied by the unit matrix $\delta = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.

The argument used for the solution of the magnetic perturbation
problem in the previous section can thus be applied, practically without
any modification, to the present case; it can be simplified by using from
the outset a coordinate system with the z-axis parallel to the vector
L (which is a constant of the unperturbed motion).


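The result is expressed by the formula
$$\Delta H' = -\frac{1}{2m_0 c^2}\left[\overline{\left(H' + \frac{Ze^2}{r}\right)^2} \pm \alpha L\,\overline{\left(\frac{1}{r^3}\right)}\right], \tag{263}$$
where the averaging is to be carried out for the unperturbed motion
with the help of the usual (scalar) Schrödinger function $\psi$ specifying it,
according to the formula $\overline{F} = \int F\psi\psi^*\,dV$. The preceding formula can
be interpreted by assuming two types of the perturbed motion with the
electron’s spin axis parallel to the axis of the orbit and having either
the same or the opposite direction ($\mathbf{L}\cdot\boldsymbol{\sigma} = \pm L$). The numerical values
of $\Delta H' = \Delta H'_\pm$ can be computed approximately by replacing the
wave-mechanical averages or probable values by the time averages of the
classical (Bohr) theory. The latter gives†
$$\overline{\frac{1}{r}} = \frac{1}{a}, \qquad \overline{\frac{1}{r^2}} = \frac{1}{ab}, \qquad \overline{\frac{1}{r^3}} = \frac{1}{b^3},$$
where a is the semi-major and b is the semi-minor axis of the electron’s
elliptical orbit. We thus get
$$\Delta H' = -\frac{1}{2m_0 c^2}\left[H'^2 + \frac{2Ze^2 H'}{a} + \frac{Z^2 e^4}{ab} \pm \frac{\alpha L}{b^3}\right]. \tag{263 a}$$
Now according to the Bohr theory we have further:
$$a = \frac{n^2 h^2}{4\pi^2 m_0 Ze^2}, \qquad \frac{b}{a} = \frac{k}{n}, \qquad L = \frac{kh}{2\pi}, \qquad H' = -\frac{Ze^2}{2a} = -\frac{2\pi^2 m_0 Z^2 e^4}{h^2 n^2},$$
where n is the principal and k the angular quantum number.
Substituting these expressions in (263 a), we find
$$H'^2 + \frac{2Ze^2 H'}{a} + \frac{Z^2 e^4}{ab} = \left(-3 + \frac{4n}{k}\right)H'^2,$$
$$\frac{\alpha L}{b^3} = \frac{Ze^2 h}{4\pi m_0}\,\frac{hk}{2\pi}\,\frac{n^3}{a^3 k^3} = \frac{(Ze^2)^2}{2a^2}\,\frac{n}{k^2} = \frac{2n}{k^2}\,H'^2,$$
whence
$$\Delta H' = -\frac{2H'^2}{m_0 c^2}\left[n\left(\frac{1}{k} \pm \frac{1}{2k^2}\right) - \frac{3}{4}\right]. \tag{263 b}$$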


This formula was originally obtained in 1925 by Uhlenbeck and
Goudsmit in practically the same way as that shown above, without,
however, any use of the matrix $\boldsymbol{\sigma}$ (the product $\mathbf{L}\cdot\boldsymbol{\sigma}$ being replaced by $\pm L$
on the assumption that the electron’s axis can have only two opposite
orientations parallel to the axis of the orbit).

† Cf. Born, Atommechanik, i, p. 164 (Berlin, 1925).

By applying relativity mechanics to the stationary states of the Bohr
theory, Sommerfeld, in 1915, derived the following formula:
$$\epsilon_{nk} = m_0 c^2\left[1 + \frac{\gamma^2 Z^2}{(s + k')^2}\right]^{-1/2}, \tag{264}$$
which proved to be in exact agreement with the experimental data for
the energy-levels in hydrogen and ionized helium. Here $\gamma$ is a
dimensionless constant
$$\gamma = \frac{2\pi e^2}{hc} \cong 7.3\cdot 10^{-3}, \tag{264 a}$$
$s = n - k$ is the radial quantum number, and
$$k' = \sqrt{k^2 - \gamma^2 Z^2}. \tag{264 b}$$
The constant $\gamma Z$ determines the ‘relativity splitting’ of the
energy-levels belonging to the same value of the principal quantum number
n, and so determines the ‘fine structure’ of the spectrum. When
$\gamma Z \ll 1$, we can replace formula (264) by the approximate formula
$$\epsilon_{nk} - \epsilon_n = \frac{2W_n^2}{m_0 c^2}\left(\frac{3}{4} - \frac{n}{k}\right), \tag{264 c}$$
where $W_n = \epsilon_n - m_0 c^2 = -\dfrac{m_0 c^2\gamma^2 Z^2}{2n^2} = -\dfrac{2\pi^2 m_0 e^4 Z^2}{h^2 n^2}$ stands for $H'$.
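The quality of the approximation (264 c) can be judged by comparing it numerically with the exact formula (264). The following sketch assumes energies measured in units of $m_0 c^2$ and the rounded value of $\gamma$ quoted in (264 a); the function names are illustrative only.

```python
import math

GAMMA = 7.3e-3  # rounded value of the constant gamma of (264 a)


def sommerfeld_exact(n, k, Z=1, m0c2=1.0):
    """Energy (264), rest energy included, in units of m0c2."""
    s = n - k                                   # radial quantum number
    k_prime = math.sqrt(k**2 - (GAMMA * Z) ** 2)
    return m0c2 / math.sqrt(1 + (GAMMA * Z) ** 2 / (s + k_prime) ** 2)


def w_n(n, Z=1, m0c2=1.0):
    """Unperturbed (Bohr) energy W_n = -m0c2 * gamma^2 Z^2 / (2 n^2)."""
    return -m0c2 * (GAMMA * Z) ** 2 / (2 * n**2)


def fine_structure_approx(n, k, Z=1, m0c2=1.0):
    """Right-hand side of the approximate formula (264 c)."""
    w = w_n(n, Z, m0c2)
    return 2 * w**2 / m0c2 * (0.75 - n / k)


# Shift of the level n = 2, k = 1 of hydrogen with respect to m0c2 + W_n.
exact_shift = sommerfeld_exact(2, 1) - 1.0 - w_n(2)
approx_shift = fine_structure_approx(2, 1)
```

For hydrogen the two shifts agree to within terms of relative order $\gamma^2 Z^2$, as is to be expected from an expansion broken off at the fourth power of $\gamma Z$; for heavy atoms, where $\gamma Z$ is no longer small, the exact formula must be used.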


This fine-structure formula of Sommerfeld has been brilliantly
confirmed not only for hydrogen and ionized helium, but also for X-ray
spectra of the heaviest atoms. The number of lines given by it in the
latter case (with k = 1, 2,..., n and with regard to the selection rule
$\Delta k = \pm 1$), or the number of energy-levels in the absorption spectrum
of X-rays comes out, however, too small, being equal to n instead of
2n−1, as found experimentally. Thus, for example, we have, when
n = 2 (L-group), three energy-levels, while Sommerfeld’s formula only
gives two (k = 1 and k = 2); when n = 3 we have five levels instead
of three, etc.

This difficulty was removed by Uhlenbeck and Goudsmit’s theory of
the spinning electron. To every orbit specified by the numbers n, k
there are two possible oppositely directed orientations of the spin axis
perpendicular to the plane of the orbit. Corresponding to these two
orientations, we must have two different additional energies which
bring about the doubling of all the energy-levels $\epsilon_{nk}$, according to the
formula (263 b).

However, some secondary difficulties remain unexplained by this
theory: First, one of the levels belonging to the same principal quantum
number (n) should remain undivided (since the number of different
levels is equal to 2n−1 and not to 2n). This can be explained at once
if we ascribe to the angular quantum number the values 0, 1,..., n−1
instead of 1, 2,..., n, i.e. if we introduce straight-line orbits instead of
circular ones, because obviously for such straight-line orbits all
orientations perpendicular to the direction of motion are equivalent. It
should be noticed, however, that the approximate formulae (263 b) and
(264 c), as well as the exact formula (264), cannot be applied to the
case k = 0.

Secondly, for hydrogen and ionized helium—briefly in the case of 
atomic systems with a single electron—the experimental data fit exactly 
with Sommerfeld’s formula both with regard to the number and the 
position of the levels, if k is assumed to take the values 1, 2,..., n. 

This difficulty can also be overcome by a more exact analysis of the 
‘splitting due to spin’ and its comparison with that due to the variability 
of mass (‘relativity splitting’ in the sense of Sommerfeld’s theory). 

Formula (263 b) is not valid for k = 0. In general, it is so much
the more accurate the larger k is. In this limiting case we have
$$\frac{1}{k} \pm \frac{1}{2k^2} \cong \frac{1}{k \mp \tfrac{1}{2}},$$
so that formula (263 b) becomes identical with Sommerfeld’s formula
(264 c), provided k (= n, n−1, n−2,...) is replaced by $k \mp \tfrac{1}{2}$, each
energy-level appearing twice for two consecutive values of k (the one
increased and the other diminished by $\tfrac{1}{2}$).

The appearance of half-integral values of k (= $n - \tfrac{1}{2}$, $n - \tfrac{3}{2}$, etc.) can
be explained by the fact that on the wave-mechanical theory the angular
momentum L is equal to $\sqrt{l(l+1)}\,h/2\pi$, and not to $hk/2\pi$. Now since
$l(l+1) = (l + \tfrac{1}{2})^2 - \tfrac{1}{4}$, we can put, for large values of l,
$$L = \frac{h}{2\pi}\left(l + \tfrac{1}{2}\right) = \frac{h}{2\pi}\left(k - \tfrac{1}{2}\right),$$
where l = k−1 is the angular quantum number of the Schrödinger
theory.†

The average values of $1/r$, $1/r^2$, and $1/r^3$ have been calculated above,
for the sake of simplicity, with the help of the old quantum theory; it
can be shown, however, that the results obtained are not substantially
altered on the Schrödinger theory if Bohr’s k is replaced everywhere
by $l + \tfrac{1}{2}$.

We shall see in a later section that the exact wave-mechanical theory
based on Dirac’s equation leads, in the case of a one-electron atomic
system, to precisely the same results as the old theory of Sommerfeld,
the spin-doubling remaining unrevealed. It becomes manifest, however,
as soon as we turn to more complicated atoms in which the motion of
each electron takes place in a field of force deviating (owing to the
action of the other electrons) from the purely Coulomb one. This
follows immediately from the expression (263) in which $1/r^3$ must be
replaced by some other (more rapidly decreasing) function of the
distance, with the result that the two terms of (263), corresponding
to the relativistic variation of the mass and to the spin effect, can no
longer be combined into a single term, corresponding on the old theory
to the mass effect alone.

† Cf. infra, § 33.

The two states resulting from a single state of the Schrödinger theory
and specified by the orientation of the electron’s spin angular momentum
in the direction of the orbital angular momentum or in the opposite
direction are distinguished with the help of a special quantum number
(formerly called the ‘inner’ quantum number) j, assuming the value
$j = l + \tfrac{1}{2}$ for the former state and the value $j = l - \tfrac{1}{2}$ for the latter; the
product of j with $h/2\pi$ can be regarded accordingly as the resulting
angular momentum of the electron. This interpretation corresponds
rather to the old quantum theory; it can be shown, however, that in
wave mechanics the number j plays, in regard to the total angular
momentum M, exactly the same role as the angular quantum number
l in regard to the orbital angular momentum L. We have, for instance,
for the characteristic values of $M^2$
$$M^2 = \frac{h^2}{4\pi^2}\,j(j+1),$$
which can be obtained from the formula $M^2 = (\mathbf{L} + \mathbf{s})^2 = L^2 + 2\mathbf{L}\cdot\mathbf{s} + s^2$,
where $\mathbf{s}$ denotes the spin angular momentum, if we put $s^2 = \tfrac{3}{4}h^2/4\pi^2$,
$L^2 = h^2 l(l+1)/4\pi^2$, and $2\mathbf{L}\cdot\mathbf{s} = h^2 l/4\pi^2$ in the case $j = l + \tfrac{1}{2}$ (in the case
$j = l - \tfrac{1}{2}$, l must be replaced by $-l-1$).
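This composition rule can be checked with exact rational arithmetic. The sketch below works in units of $h^2/4\pi^2$ and assumes the values of $L^2$, $2\mathbf{L}\cdot\mathbf{s}$ and $s^2$ stated above; the function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def m_squared(l, parallel):
    """L^2 + 2 L.s + s^2 in units of h^2 / 4 pi^2.

    parallel=True corresponds to j = l + 1/2, parallel=False to j = l - 1/2.
    """
    l = Fraction(l)
    two_l_dot_s = l if parallel else -(l + 1)
    return l * (l + 1) + two_l_dot_s + Fraction(3, 4)


def j_times_j_plus_one(l, parallel):
    """j(j+1) for the corresponding value of the inner quantum number j."""
    j = Fraction(l) + (HALF if parallel else -HALF)
    return j * (j + 1)


# The two expressions coincide for every l and for both orientations of the spin.
rule_holds = all(
    m_squared(l, p) == j_times_j_plus_one(l, p)
    for l in range(1, 8)
    for p in (True, False)
)
```

For l = 1, for instance, the two states have $j = \tfrac{3}{2}$ and $j = \tfrac{1}{2}$, with $j(j+1)$ equal to $\tfrac{15}{4}$ and $\tfrac{3}{4}$ respectively.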

As has been shown above, for a motion in a Coulomb field of force the
inner quantum number j also plays the same role as l (in the absence
of spin) with regard to the energy.

We shall presently see that this correspondence between j and l can
be further extended in describing the splitting of the energy-levels
produced by a weak magnetic field.


B. Influence of a magnetic field (Zeeman effect) 
The preceding theory can easily be generalized to allow for the
presence of a homogeneous magnetic field $\mathfrak H$. The radially symmetrical
electric field will be represented by the vector $\mathbf E = f(r)\,\mathbf r$.


If the unperturbed motion is defined as that corresponding to the
absence of the magnetic field and to the neglect of the relativity
(mass-spin) corrections, i.e. if it is specified by the ordinary energy
operator $H = \frac{1}{2m_0}p^2 + U(r)$ (multiplied by
$\delta = \begin{pmatrix}1&0\\0&1\end{pmatrix}$), then neglecting terms
of the second order in $\mathfrak H$ we can represent the complete energy operator
$K$ as the sum of $H$ and of the perturbation energy

$$S = -\frac{1}{2m_0c^2}\left[(H'-U)^2 + c\mu f\,\mathbf L\cdot\boldsymbol\sigma\right] + \mathfrak H\cdot\left(\frac{e}{2m_0c}\,\mathbf L + \mu\boldsymbol\sigma\right),$$

where $\mu = he/(4\pi m_0c)$ is the absolute value of the electron’s intrinsic
moment, the electronic charge being denoted by $-e$ so that

$$-e\mathbf E = -\nabla U, \quad\text{or}\quad f(r) = \frac{1}{er}\frac{dU}{dr}.$$

This can be written in the form

$$S = A + \mathbf B\cdot\boldsymbol\sigma \tag{265}$$

with

$$A = -\frac{1}{2m_0c^2}(H'-U)^2 + \frac{e}{2m_0c}\,\mathfrak H\cdot\mathbf L \tag{265 a}$$

and

$$\mathbf B = -\beta\mathbf L + \mu\mathfrak H, \tag{265 b}$$

where $\beta = \mu f/(2m_0c)$.

The determination of the energy-levels of the two perturbed states
resulting from a single unperturbed one can be carried out with the
help of the general method outlined in the preceding section in
connexion with a perturbation due to the magnetic field alone [see equations
(250)-(251 b)]. We thus get

$$\Delta H' = \bar A \pm \bar B, \tag{266}$$

where $\bar A = \int \psi^* A\psi\,dV$ and $\bar B = \sqrt{(\bar B_x)^2+(\bar B_y)^2+(\bar B_z)^2}$ is the quadratic
average of the vector $\mathbf B$. If $\mathbf L$ is dealt with as a constant vector (which
is quite exact for the unperturbed motion), we have

$$\bar B = \sqrt{\bar\beta^2L^2 - 2\bar\beta\mu\,\mathfrak H\cdot\mathbf L + \mu^2\mathfrak H^2}. \tag{266 a}$$

In the extreme case of a very strong magnetic field, such that $\mu\mathfrak H \gg \bar\beta L$,
this expression reduces to $\mu\mathfrak H$. Putting, further, $\mathfrak H\cdot\mathbf L = h\mathfrak H m_l/2\pi$,
where $m_l$ is the axial (magnetic) quantum number for the orbital
motion, and neglecting the first terms in (265a) and (265b) compared
with the second ones, we get

$$\Delta H' = \mu\mathfrak H(m_l \pm 1), \tag{266 b}$$

i.e. the same result as in the case of the ‘normal’ Zeeman effect, corre- 

sponding to the absence of spin; the influence of the latter is expressed 


in the replacement of the axial quantum number $m_l$ by $m = m_l \pm 1$,
both numbers being integers.

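In the opposite case of a very weak magnetic field ($\mu\mathfrak H \ll \bar\beta L$) we
obtain a splitting of a different type, usually denoted as the ‘anomalous’
Zeeman effect. Expanding the exact expression (266 a), and neglecting
the terms of the second and higher orders in $\mathfrak H$, we get

$$\bar B = \bar\beta L\left(1 - \frac{\bar\beta\mu\,\mathfrak H\cdot\mathbf L}{\bar\beta^2L^2}\right) = \bar\beta L - \frac{\mu\,\mathfrak H\cdot\mathbf L}{L} = \bar\beta L - \frac{\mu m_l h\mathfrak H}{2\pi L}$$

or, putting $L = h(l+\tfrac12)/2\pi$ and neglecting the ‘relativity correction’
(represented by the first term in (265a)),

$$\Delta H' = \mp\bar\beta L + \mu\mathfrak H m_l\left(1 \pm \frac{1}{l+\tfrac12}\right), \tag{266 c}$$

where the upper and lower signs refer to the values $j = l+\tfrac12$ and
$j = l-\tfrac12$ of the ‘inner quantum number’ which determines the total
angular momentum $\mathbf M$.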
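The two limiting forms of (266 a) can be illustrated numerically. The sketch below is not taken from the text: it treats $\bar\beta L$, $\mu\mathfrak H$ and the cosine of the angle between $\mathfrak H$ and $\mathbf L$ as plain numbers chosen arbitrarily, and the function names are hypothetical.

```python
import math

def b_exact(beta_l, mu_h, cos_theta):
    """Quadratic average of B as in (266 a), with H.L = H L cos(theta)."""
    return math.sqrt(beta_l**2 - 2.0 * beta_l * mu_h * cos_theta + mu_h**2)

def b_weak(beta_l, mu_h, cos_theta):
    """First-order expansion valid when mu*H is much smaller than beta*L."""
    return beta_l - mu_h * cos_theta

# Weak field: the error of the linear expansion is of second order in mu*H.
weak_error = abs(b_exact(1.0, 1e-3, 0.5) - b_weak(1.0, 1e-3, 0.5))

# Strong field: the exact expression approaches mu*H itself.
strong_ratio = b_exact(1.0, 1e3, 0.5) / 1e3

print(weak_error, strong_ratio)
```

With these sample values the weak-field error is of the order of $(\mu\mathfrak H)^2$ and the strong-field ratio is close to unity, as the two limits require.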

This result, in a somewhat different external form involving the axial
quantum number $m_j$ which determines the component of $\mathbf M$ along the
magnetic field, so long as the latter is supposed to be weak, can be
obtained by the following simple argument.

We have seen above that in the absence of a magnetic field the
vectors $\mathbf L$ and $\mathbf s$ (spin angular momentum) are not constants of the
motion, even if the latter takes place in a radially symmetrical electrical
field; the sum $\mathbf L+\mathbf s = \mathbf M$ (total angular momentum) is, however,
constant in this case. Further, it can easily be shown that the squares of
$\mathbf s$ and $\mathbf L$ remain constant, so that the perturbation produced by the
spin alone can be pictured as the rotation (precession) of the two vectors
$\mathbf L$ and $\mathbf s$ of constant magnitude about their resultant $\mathbf M$ (Fig. 3). The


average values of $\mathbf s$ and $\mathbf L$ must therefore be parallel to $\mathbf M$, and can be
expressed accordingly by the equations

$$\bar{\mathbf s} = (g-1)\mathbf M, \qquad \bar{\mathbf L} = (2-g)\mathbf M, \tag{267}$$

where $g$ is a certain numerical coefficient ($\bar{\mathbf s}+\bar{\mathbf L} = \mathbf s+\mathbf L = \mathbf M$).

It should be mentioned that this ‘graphical’ representation of the
spin perturbation does not give correct results if we assume at the
outset that the vectors $\mathbf s$ and $\mathbf L$ are parallel to each other (in the same or
opposite directions), as has been concluded previously from equation
(263).

The coefficient $g$ can be determined with the help of the formula
$L^2 = (\mathbf M-\mathbf s)^2 = M^2-2\mathbf M\cdot\mathbf s+s^2$ if we put $L^2 = h^2l(l+1)/4\pi^2$,
$M^2 = h^2j(j+1)/4\pi^2$, $s^2 = \tfrac34h^2/4\pi^2$ and replace the scalar product $\mathbf M\cdot\mathbf s$
by $(g-1)M^2$. This gives

$$g-1 = \frac{j(j+1)-l(l+1)+\tfrac34}{2j(j+1)}; \tag{267 a}$$

that is

$$g-1 = \pm\frac{1}{2l+1} \qquad (j = l\pm\tfrac12). \tag{267 b}$$


The perturbation produced by a sufficiently weak magnetic field can
be pictured in the same graphical way as the rotation (precession) of
the parallelogram, formed by the vectors $\mathbf s$, $\mathbf L$, $\mathbf M$, as a rigid body about
the direction of the magnetic field, the magnitude of all the three
vectors remaining thus constant as before.

The additional magnetic energy can be determined to the first
approximation as the average value of the magnetic perturbation energy

$$S_{\mathfrak H} = \frac{e}{2m_0c}\,\mathfrak H\cdot(\mathbf L+2\mathbf s) = \frac{e}{2m_0c}\,\mathfrak H\cdot(\mathbf M+\mathbf s)$$

for the unperturbed motion. Replacing $\mathbf s$ by $(g-1)\mathbf M$, we get

$$\bar S_{\mathfrak H} = \Delta H' = \frac{e}{2m_0c}\,g\,(\mathfrak H\cdot\mathbf M). \tag{268}$$

The factor $g$ was introduced for the first time by Landé (in 1922).
It can be interpreted as the ratio of the angular velocity of precession
of the ($\mathbf s$, $\mathbf L$) parallelogram about the direction of $\mathfrak H$ to the classical or
‘Larmor’ angular velocity $\omega = e\mathfrak H/(2m_0c)$, which corresponds to the
absence of spin.
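The passage from (267 a) to (267 b) is a short piece of algebra that can be confirmed with exact fractions. The sketch below is illustrative only; `lande_g` is a hypothetical name, and the spin value $\tfrac12$ is built in through the constant $\tfrac34$.

```python
from fractions import Fraction

def lande_g(l, j):
    """g from (267 a): g - 1 = [j(j+1) - l(l+1) + 3/4] / [2 j(j+1)]."""
    return 1 + (j * (j + 1) - l * (l + 1) + Fraction(3, 4)) / (2 * j * (j + 1))

def matches_267b(l):
    """Check g - 1 = +1/(2l+1) for j = l + 1/2 and -1/(2l+1) for j = l - 1/2."""
    upper = lande_g(l, Fraction(2 * l + 1, 2)) - 1 == Fraction(1, 2 * l + 1)
    lower = lande_g(l, Fraction(2 * l - 1, 2)) - 1 == Fraction(-1, 2 * l + 1)
    return upper and lower

all_ok = all(matches_267b(l) for l in range(1, 8))
g_p_three_halves = lande_g(1, Fraction(3, 2))
g_p_one_half = lande_g(1, Fraction(1, 2))
print(all_ok, g_p_three_halves, g_p_one_half)
```

For $l = 1$ this gives $g = 4/3$ for $j = \tfrac32$ and $g = 2/3$ for $j = \tfrac12$.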

The projection of the vector $\mathbf M$ on $\mathfrak H$ preserves a constant quantized
value which can be shown to be given by the formula

$$M_{\mathfrak H} = \frac{h}{2\pi}\,m_j, \tag{268 a}$$

where $m_j$ is the axial quantum number. For a state with a given $j$ it
can assume the $2j+1$ half-integral values lying between $+j$ and $-j$.
It thus plays, with regard to $j$, exactly the same role as the ordinary
axial quantum number $m_l$ with regard to $l$ in the theory of the spinless
electron.

With the help of (268 a) the expression (268) can be rewritten in
the form

$$\Delta H' = \mu\mathfrak H m_j g = \mu\mathfrak H m_j\left(1 \pm \frac{1}{2l+1}\right). \tag{268 b}$$

It differs from (266 c) (without the term $\bar\beta L$ not involving the magnetic
field) by the fact that $m_l$ is replaced by $m_j$ and $\dfrac{1}{l+\frac12}$ by $\dfrac{1}{2l+1}$. This
difference is, however, easily seen to correspond to the connexion
between the projections of the vectors $\mathbf L$ and $\mathbf M$ on the magnetic field.
Replacing the vector $\mathbf L$ in (266a) by its average value according to
(267), we get

$$\mathfrak H\cdot\bar{\mathbf L} = (2-g)\,\mathfrak H\cdot\mathbf M = (2-g)\frac{h}{2\pi}m_j\mathfrak H,$$

and consequently

$$\Delta H' = \mu\mathfrak H m_j(2-g)\left(1 \pm \frac{1}{l+\tfrac12}\right)$$

instead of the expression (266c), or that part of it which is
proportional to $\mathfrak H$. Equating this to $\mu\mathfrak H g m_j$, we obtain the following equation
for the factor $g$:

$$g = (2-g)\left(1 \pm \frac{1}{l+\tfrac12}\right),$$

whence approximately

$$2(g-1) = \pm\frac{1}{l+\tfrac12},$$

which coincides with (267 b).

Each level, specified by the quantum numbers $n$, $l$, $j$, is split up in
a weak magnetic field into $2j+1$ equidistant levels with the spacing

$$\Delta H'_{m_j+1} - \Delta H'_{m_j} = g\mu\mathfrak H = \left(1 \pm \frac{1}{2l+1}\right)\mu\mathfrak H,$$

where the plus sign refers to the case $j = l+\tfrac12$ and the minus sign to
the case $j = l-\tfrac12$.
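As an illustration of the last statement, the weak-field levels for a given $l$ and $j$ can be listed explicitly. This is a minimal sketch using exact fractions, with energies expressed in units of $\mu\mathfrak H$; the function names are hypothetical and the value of $g$ is taken from (267 b).

```python
from fractions import Fraction

def lande_g(l, sign):
    """g = 1 + sign/(2l+1) for j = l + sign/2, following (267 b)."""
    return 1 + Fraction(sign, 2 * l + 1)

def weak_field_levels(l, sign):
    """Energies g*m_j (in units of mu*H) for m_j = -j, -j+1, ..., +j."""
    j = Fraction(2 * l + sign, 2)
    g = lande_g(l, sign)
    count = int(2 * j + 1)
    return [g * (-j + k) for k in range(count)]

levels = weak_field_levels(1, +1)          # l = 1, j = 3/2
spacings = {b - a for a, b in zip(levels, levels[1:])}
print(levels, spacings)
```

For $l = 1$, $j = \tfrac32$ the sketch yields four levels separated by the single spacing $\tfrac43\mu\mathfrak H$.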

We have assumed above that the magnetic field was ‘sufficiently
small’. The standard field with which it has to be compared in this
sense is the ‘effective’ magnetic field which determines the spin
perturbation in the case $\mathfrak H = 0$. This field is parallel and proportional to
$\mathbf L$, as has been shown above [cf. eq. (262b)], and can therefore be
defined by the formula $\mu\mathfrak H_{\text{eff}} = \beta L$.

If $\mathfrak H$ is much larger than $\mathfrak H_{\text{eff}}$, the vectors $\mathbf L$ and $\mathbf s$ are no longer
held together in the rigid parallelogram (Fig. 2), but must be imagined
to precess independently about the direction of $\mathfrak H$, the former with the
normal Larmor frequency and the latter with twice this frequency. We
get in this case, instead of (268),

$$\bar S_{\mathfrak H} = \frac{e}{2m_0c}\,\mathfrak H\,(L_{\mathfrak H} + 2s_{\mathfrak H}) = \mu\mathfrak H(m_l \pm 1),$$

in agreement with (266b). The modification of the Zeeman effect which
takes place in a transition from a weak magnetic field to a strong one is
known as the Paschen-Back effect.

The preceding results will be established in a more rigorous and 
complete way in a later section on the basis of Dirac’s exact theory. 


31. The Exact Four-dimensional Matrix Theory of Dirac 
The four equations of the Dirac theory, which in the last section were 
written in the form of two matrix equations of the Pauli type, can be 
put in the form of a single matrix equation (they were actually first 
given by Dirac in this form), in a way perfectly similar to that which 
has been applied for the same purpose to the Pauli equations. 

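The four functions of Dirac, $\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$, will be considered
accordingly as the four elements of a one-column matrix:

$$\psi = \begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\\ \psi_4\end{pmatrix} \tag{269}$$

(or the components of a four-dimensional vector), the adjoint matrix
(complex conjugate vector) being

$$\psi^\dagger = (\psi_1^*\ \ \psi_2^*\ \ \psi_3^*\ \ \psi_4^*). \tag{269 a}$$

Introducing a suitably defined square matrix of the fourth rank
(four-dimensional tensor) $A$ we can represent the four first-order
equations (229a)-(229b) as the four components of the matrix (or vector)
equation

$$A\psi = 0, \tag{270}$$

writing them in the form

$$\begin{aligned}
(A\psi)_1 &= A_{11}\psi_1+A_{12}\psi_2+A_{13}\psi_3+A_{14}\psi_4 = 0\\
(A\psi)_2 &= A_{21}\psi_1+A_{22}\psi_2+A_{23}\psi_3+A_{24}\psi_4 = 0\\
(A\psi)_3 &= A_{31}\psi_1+A_{32}\psi_2+A_{33}\psi_3+A_{34}\psi_4 = 0\\
(A\psi)_4 &= A_{41}\psi_1+A_{42}\psi_2+A_{43}\psi_3+A_{44}\psi_4 = 0
\end{aligned} \tag{270 a}$$

Identifying these equations respectively with the first, second, third,
and fourth equations (229a)-(229b), we get the following definition of
the matrix $A$:

$$A = \alpha_x u_x+\alpha_y u_y+\alpha_z u_z+\alpha_t u_t+\alpha_0 m_0c \tag{271}$$

with

$$\alpha_x = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix},\quad
\alpha_y = \begin{pmatrix}-i&0&0&0\\0&i&0&0\\0&0&-i&0\\0&0&0&i\end{pmatrix},\quad
\alpha_z = \begin{pmatrix}0&1&0&0\\-1&0&0&0\\0&0&0&1\\0&0&-1&0\end{pmatrix},$$

$$\alpha_t = \begin{pmatrix}0&0&0&1\\0&0&1&0\\0&1&0&0\\1&0&0&0\end{pmatrix},\quad
\alpha_0 = \begin{pmatrix}0&0&0&-1\\0&0&-1&0\\0&1&0&0\\1&0&0&0\end{pmatrix}. \tag{271 a}$$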

This form of the Dirac equations corresponds to a privileged role of the
coordinate $x$, the associated matrix $\alpha_x$ reducing to the four-dimensional
unit-matrix $\delta$. It is possible, however, to rewrite them in four other
equivalent forms, corresponding to the shifting of this privilege to one
of the other four matrices $\alpha$. This can be done in the simplest way by
rearranging the original equations (229a)-(229 b) and eventually
multiplying them by $-1$. For instance, to reduce the matrix $\alpha_0$ to $\delta$ we
multiply the two equations (229a) by $-1$, and rewrite the four equations
in the reverse order. We thus get

$$\begin{aligned}
(u_x+iu_y)\psi_4-u_z\psi_3+(u_t+m_0c)\psi_1 &= 0,\\
(u_x-iu_y)\psi_3+u_z\psi_4+(u_t+m_0c)\psi_2 &= 0,\\
-(u_x+iu_y)\psi_2+u_z\psi_1+(-u_t+m_0c)\psi_3 &= 0,\\
-(u_x-iu_y)\psi_1-u_z\psi_2+(-u_t+m_0c)\psi_4 &= 0,
\end{aligned}$$

which can be written in the form

$$B\psi = 0, \tag{272}$$

with

$$B = \beta_x u_x+\beta_y u_y+\beta_z u_z+\beta_t u_t+\beta_0 m_0c, \tag{272 a}$$

where

$$\beta_x = \begin{pmatrix}0&0&0&1\\0&0&1&0\\0&-1&0&0\\-1&0&0&0\end{pmatrix},\quad
\beta_y = \begin{pmatrix}0&0&0&i\\0&0&-i&0\\0&-i&0&0\\i&0&0&0\end{pmatrix},\quad
\beta_z = \begin{pmatrix}0&0&-1&0\\0&0&0&1\\1&0&0&0\\0&-1&0&0\end{pmatrix},$$

$$\beta_t = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix},\quad
\beta_0 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}. \tag{272 b}$$


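Rewriting equations (229a)-(229b) in the inverse order without
multiplying (229a) by $-1$, we get in a similar way

$$\Gamma\psi = 0, \tag{273}$$

with

$$\Gamma = \gamma_x u_x+\gamma_y u_y+\gamma_z u_z+\gamma_t u_t+\gamma_0 m_0c, \tag{273 a}$$

where

$$\gamma_x = \begin{pmatrix}0&0&0&1\\0&0&1&0\\0&1&0&0\\1&0&0&0\end{pmatrix},\quad
\gamma_y = \begin{pmatrix}0&0&0&i\\0&0&-i&0\\0&i&0&0\\-i&0&0&0\end{pmatrix},\quad
\gamma_z = \begin{pmatrix}0&0&-1&0\\0&0&0&1\\-1&0&0&0\\0&1&0&0\end{pmatrix},$$

$$\gamma_t = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix},\quad
\gamma_0 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix}. \tag{273 b}$$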

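This last form of the Dirac equations is especially useful because the
matrices $\gamma$ are all Hermitian, while the matrices $\alpha$ and $\beta$ are not. There
is, moreover, a very simple relationship between the Dirac matrices
$\gamma_x$, $\gamma_y$, $\gamma_z$ and the Pauli ‘spin’ matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ which can be expressed
by the equations

$$\gamma_x = \begin{pmatrix}0&\sigma_x\\ \sigma_x&0\end{pmatrix},\quad
\gamma_y = \begin{pmatrix}0&\sigma_y\\ \sigma_y&0\end{pmatrix},\quad
\gamma_z = \begin{pmatrix}0&\sigma_z\\ \sigma_z&0\end{pmatrix}, \tag{274}$$

with $0$ meaning the two-dimensional zero matrix $\begin{pmatrix}0&0\\0&0\end{pmatrix}$. The Dirac
matrices $\gamma_x$, $\gamma_y$, $\gamma_z$ can be thus defined as ‘supermatrices’ of the second
rank, whose elements are constituted by the corresponding Pauli
matrices and the two-dimensional zero matrices.

Further, it can easily be shown that the matrices $\gamma_x$, $\gamma_y$, $\gamma_z$, just like
the Pauli matrices $\sigma_x$, $\sigma_y$, $\sigma_z$, anticommute with each other and with the
matrix $\gamma_0$, so that putting for the sake of brevity

$$\gamma_x = \gamma_1,\quad \gamma_y = \gamma_2,\quad \gamma_z = \gamma_3,\quad \gamma_0 = \gamma_4$$

($\gamma_t$ must be left aside, since it is equal to the unit matrix $\delta$), we have

$$\gamma_\mu\gamma_\nu = -\gamma_\nu\gamma_\mu \qquad (\mu \neq \nu). \tag{274 a}$$

A relation of the type $\sigma_x\sigma_y = i\sigma_z$, etc., does not hold, however, for the
matrices $\gamma_x$, $\gamma_y$, $\gamma_z$. We have, for instance, according to (273b),

$$\gamma_x\gamma_y = \begin{pmatrix}-i&0&0&0\\0&i&0&0\\0&0&-i&0\\0&0&0&i\end{pmatrix}
= i\begin{pmatrix}-1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&1\end{pmatrix}
= i\begin{pmatrix}\sigma_z&0\\0&\sigma_z\end{pmatrix},$$

which is different from $i\gamma_z$.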

To equations (274a) we may add the equations

$$\gamma_\mu^2 = \delta, \tag{274 b}$$

which are easily verified.
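The relations (274)-(274 b) can be verified mechanically. The sketch below is an illustration only; it uses plain Python lists, hypothetical helper names, and the Pauli matrices in the sign convention that can be read off from the blocks of (273 b), namely $\sigma_y = \begin{pmatrix}0&i\\-i&0\end{pmatrix}$ and $\sigma_z = \begin{pmatrix}-1&0\\0&1\end{pmatrix}$ (an assumption of this example).

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def off_diagonal(s):
    """4x4 supermatrix (0 s; s 0) built from a 2x2 matrix s, as in (274)."""
    return [[0, 0] + s[0], [0, 0] + s[1], s[0] + [0, 0], s[1] + [0, 0]]

def diagonal(s):
    """4x4 supermatrix (s 0; 0 s)."""
    return [s[0] + [0, 0], s[1] + [0, 0], [0, 0] + s[0], [0, 0] + s[1]]

def dagger(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

UNIT = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

# Pauli matrices in the convention assumed above.
sx, sy, sz = [[0, 1], [1, 0]], [[0, 1j], [-1j, 0]], [[-1, 0], [0, 1]]
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
gammas = [off_diagonal(sx), off_diagonal(sy), off_diagonal(sz), g0]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

all_anticommute = all(anticommutator(gammas[m], gammas[n]) == ZERO
                      for m in range(4) for n in range(4) if m != n)
all_square_to_unit = all(matmul(g, g) == UNIT for g in gammas)
all_hermitian = all(dagger(g) == g for g in gammas)

# gamma_x gamma_y equals i (sigma_z 0; 0 sigma_z), which is not i gamma_z.
gx_gy = matmul(gammas[0], gammas[1])
i_xi_z = [[1j * x for x in row] for row in diagonal(sz)]
i_gamma_z = [[1j * x for x in row] for row in gammas[2]]
print(all_anticommute, all_square_to_unit, all_hermitian,
      gx_gy == i_xi_z, gx_gy == i_gamma_z)
```

The checks remain valid if the more usual signs of $\sigma_y$ and $\sigma_z$ are used instead, since only the algebraic relations among the matrices enter.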

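It should be mentioned that the four matrices $\alpha$ or $\beta$ (which are
different from $\delta$) also satisfy anticommutative relations of the type
(274a), while their squares are equal to $\pm\delta$. We have, namely,

$$\alpha_y^2 = \alpha_z^2 = \alpha_0^2 = -\delta,\quad \alpha_t^2 = \delta;\qquad
\beta_x^2 = \beta_y^2 = \beta_z^2 = -\delta,\quad \beta_t^2 = \delta, \tag{274 c}$$

and, of course, $\beta_0^2 = \alpha_x^2 = \delta$ (since $\beta_0 = \alpha_x = \delta$).

With the help of these relations the transition from one form of
Dirac’s equations to some other equivalent form can be carried out by
the multiplication of the former by that matrix which must be replaced
by $\delta$ (with the $+$ or $-$ sign as the case may be). We have, for example,

$$A = \gamma_x\Gamma,\quad B = \gamma_0\Gamma,\quad A = -\beta_xB,\quad \Gamma = \alpha_tA = \beta_tB,$$

which means that

$$\alpha_x = \gamma_x\gamma_x,\ \ \alpha_y = \gamma_x\gamma_y,\ \ \alpha_z = \gamma_x\gamma_z,\ \ \alpha_t = \gamma_x\gamma_t = \gamma_x,\ \ \alpha_0 = \gamma_x\gamma_0;\quad \beta_t = \gamma_0,$$

etc.; these relations can be verified directly.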
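Such a direct verification is easily mechanized. The following sketch is illustrative only: the matrices are typed in as they can be read off from (271 a), (272 b) and (273 b), and the dictionary keys and helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

I = 1j
UNIT = [[1 if r == c else 0 for c in range(4)] for r in range(4)]

gamma = {
    "x": [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
    "y": [[0, 0, 0, I], [0, 0, -I, 0], [0, I, 0, 0], [-I, 0, 0, 0]],
    "z": [[0, 0, -1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, 1, 0, 0]],
    "t": UNIT,
    "0": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
}
alpha = {
    "x": UNIT,
    "y": [[-I, 0, 0, 0], [0, I, 0, 0], [0, 0, -I, 0], [0, 0, 0, I]],
    "z": [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]],
    "t": [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
    "0": [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
}
beta = {
    "x": [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    "y": [[0, 0, 0, I], [0, 0, -I, 0], [0, -I, 0, 0], [I, 0, 0, 0]],
    "z": [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]],
    "t": gamma["0"],
    "0": UNIT,
}

# A = gamma_x Gamma and B = gamma_0 Gamma, component by component.
alpha_ok = all(alpha[k] == matmul(gamma["x"], gamma[k]) for k in gamma)
beta_ok = all(beta[k] == matmul(gamma["0"], gamma[k]) for k in gamma)

def square_sign(m):
    """Return +1 or -1 if m*m equals +unit or -unit, otherwise None."""
    sq = matmul(m, m)
    for sign in (1, -1):
        if sq == [[sign * x for x in row] for row in UNIT]:
            return sign
    return None

alpha_signs = {k: square_sign(m) for k, m in alpha.items()}
beta_signs = {k: square_sign(m) for k, m in beta.items()}
print(alpha_ok, beta_ok, alpha_signs, beta_signs)
```

The signs of the squares obtained in this way are $-1$ for $\alpha_y$, $\alpha_z$, $\alpha_0$, $\beta_x$, $\beta_y$, $\beta_z$ and $+1$ for the remaining matrices.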

We can further easily derive from the first-order equations the
second-order equations of Dirac’s theory in a similar matrix form. This
can be done in the simplest way by applying to the equation $B\psi = 0$
the operator

$$\bar B = -(\beta_xu_x+\beta_yu_y+\beta_zu_z+\beta_tu_t)+\beta_0m_0c.$$

We thus get $\bar BB\psi = 0$, or, carrying out the multiplication and taking
account of the relations (274 a) and (274b):

$$\begin{aligned}
\{(u_x^2+u_y^2+u_z^2-u_t^2+m_0^2c^2)
&-[\beta_y\beta_z(u_yu_z-u_zu_y)+\beta_z\beta_x(u_zu_x-u_xu_z)+\beta_x\beta_y(u_xu_y-u_yu_x)\\
&+\beta_x\beta_t(u_xu_t-u_tu_x)+\beta_y\beta_t(u_yu_t-u_tu_y)+\beta_z\beta_t(u_zu_t-u_tu_z)]\}\psi = 0.
\end{aligned}$$

This equation can be written in the form

$$Q\psi = 0 \tag{275}$$

with the matrix operator

$$Q = D\delta-\frac{he}{2\pi c}(\mathfrak H\cdot\boldsymbol\xi+\mathbf E\cdot\boldsymbol\eta), \tag{275 a}$$

where $D = u_x^2+u_y^2+u_z^2-u_t^2+m_0^2c^2$,

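as before, while $\boldsymbol\xi$ and $\boldsymbol\eta$ are vector-matrices with the rectangular
components

$$\xi_x = \begin{pmatrix}0&1&0&0\\1&0&0&0\\0&0&0&1\\0&0&1&0\end{pmatrix},\quad
\xi_y = \begin{pmatrix}0&i&0&0\\-i&0&0&0\\0&0&0&i\\0&0&-i&0\end{pmatrix},\quad
\xi_z = \begin{pmatrix}-1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&1\end{pmatrix}, \tag{275 b}$$

$$\eta_x = \begin{pmatrix}0&0&0&-i\\0&0&-i&0\\0&-i&0&0\\-i&0&0&0\end{pmatrix},\quad
\eta_y = \begin{pmatrix}0&0&0&1\\0&0&-1&0\\0&1&0&0\\-1&0&0&0\end{pmatrix},\quad
\eta_z = \begin{pmatrix}0&0&i&0\\0&0&0&-i\\i&0&0&0\\0&-i&0&0\end{pmatrix}. \tag{275 c}$$

We can also write down the relations

$$\begin{aligned}
\xi_x &= i\beta_y\beta_z, & \xi_y &= i\beta_z\beta_x, & \xi_z &= i\beta_x\beta_y,\\
\eta_x &= i\beta_x\beta_t, & \eta_y &= i\beta_y\beta_t, & \eta_z &= i\beta_z\beta_t,
\end{aligned} \tag{276}$$

or in vector notation

$$\boldsymbol\xi = \tfrac12 i\,\boldsymbol\beta\times\boldsymbol\beta,\qquad \boldsymbol\eta = i\boldsymbol\beta\beta_t = -i\boldsymbol\gamma. \tag{276 a}$$

The identity of equation (275) with the four equations (230)-(230 a)
is easily verified.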

It should be mentioned that the actual way in which Dirac first
obtained his first-order equation $B\psi = 0$ was to some extent the reverse
of the preceding derivation for the particular case of the free motion when
the matrix $Q$ reduces to the operator $D$ (multiplied by $\delta$). Assuming the
possibility of representing $Q$ in this case in the form $\bar BB$ one can easily
obtain the conditions $\beta_x^2 = \beta_y^2 = \beta_z^2 = -\beta_t^2 = -\delta$ and $\beta_\mu\beta_\nu = -\beta_\nu\beta_\mu$
($\mu \neq \nu$) for the matrices $\beta$; after this the first-order equation $B\psi = 0$
is naturally generalized for the motion of the electron in an arbitrary
field of force (by replacing $\mathbf p$ by $\mathbf u$), and finally the corresponding
generalized expression for the second-order operator $Q$ is obtained in
the way shown above.

We have preferred to this straightforward method of Dirac the some- 
what more lengthy and complicated path starting with Maxwell’s 
equations, because of the resulting gain in the comprehensiveness of 
the theory. Moreover, the determination of the matrices $\beta$ from the
properties above stated is an ambiguous problem, which can be solved
only after some assumption has been made as to their rank, i.e. the
number of wave functions $\psi$, whereas in our derivation this number is
settled from the beginning with the help of the analogy between 
d’Alembert’s equation and Maxwell’s equations on the one hand, and 
the wave-mechanical equations of the second and first order on the 
other. 



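The four-dimensional second-order equation (275) is equivalent to
the two equations (258) and (258 a) involving the two-dimensional Pauli
spin matrix $\boldsymbol\sigma$. The Dirac matrix $\boldsymbol\xi$ can be defined as a duplication of
the latter according to the formula

$$\boldsymbol\xi = \begin{pmatrix}\boldsymbol\sigma&0\\0&\boldsymbol\sigma\end{pmatrix}, \tag{277}$$

where $0$ is short for the two-dimensional zero-matrix $\begin{pmatrix}0&0\\0&0\end{pmatrix}$. This
formula is equivalent to the following three:

$$\xi_x = \begin{pmatrix}\sigma_x&0\\0&\sigma_x\end{pmatrix},\quad
\xi_y = \begin{pmatrix}\sigma_y&0\\0&\sigma_y\end{pmatrix},\quad
\xi_z = \begin{pmatrix}\sigma_z&0\\0&\sigma_z\end{pmatrix},$$

which differ from the formulae (274) for $\gamma_x$, $\gamma_y$, $\gamma_z$ by the fact that the
duplication is carried out in the direction of the right diagonal and not
of the left one. The formulae (274) can be replaced by the single vector
formula

$$\boldsymbol\gamma = \begin{pmatrix}0&\boldsymbol\sigma\\ \boldsymbol\sigma&0\end{pmatrix}.$$

The vectors $\boldsymbol\gamma$ and $\boldsymbol\xi$ are easily seen to be connected with each other
by the relations

$$\boldsymbol\gamma = \rho\boldsymbol\xi = \boldsymbol\xi\rho,\qquad \boldsymbol\xi = \rho\boldsymbol\gamma = \boldsymbol\gamma\rho, \tag{277 a}$$

where $\rho$ is the scalar matrix

$$\rho = \begin{pmatrix}0&0&1&0\\0&0&0&1\\1&0&0&0\\0&1&0&0\end{pmatrix} = \begin{pmatrix}0&\delta\\ \delta&0\end{pmatrix}, \tag{277 b}$$

which commutes with $\boldsymbol\xi$, $\boldsymbol\gamma$ and anticommutes with $\gamma_0$:

$$\rho\gamma_0 = -\gamma_0\rho.$$

It should be mentioned that $\gamma_0$ commutes with $\boldsymbol\xi$ (since it anticommutes
both with $\boldsymbol\gamma$ and with $\rho$). We have further, from comparing (273b)
and (275c):

$$\boldsymbol\eta = -i\rho\boldsymbol\xi. \tag{277 c}$$

The expression (275 a) for the matrix operator $Q$ can thus be rewritten
in the form

$$Q = D-\frac{he}{2\pi c}\,(\mathfrak H-i\rho\mathbf E)\cdot\boldsymbol\xi,$$

where the factor $\delta$ is to be understood in $D$.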
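The relations (277 a)-(277 c) lend themselves to a direct matrix check. The sketch below is illustrative only; it builds $\boldsymbol\xi$ by duplicating the Pauli matrices (taken in the sign convention of the blocks of (275 b), an assumption of this example) and uses hypothetical helper names.

```python
def matmul(a, b):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def diagonal(s):
    """4x4 supermatrix (s 0; 0 s), the 'duplication' of (277)."""
    return [s[0] + [0, 0], s[1] + [0, 0], [0, 0] + s[0], [0, 0] + s[1]]

def off_diagonal(s):
    """4x4 supermatrix (0 s; s 0), as in (274)."""
    return [[0, 0] + s[0], [0, 0] + s[1], s[0] + [0, 0], s[1] + [0, 0]]

def dagger(a):
    """Hermitian conjugate."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def scaled(c, m):
    return [[c * x for x in row] for row in m]

paulis = [[[0, 1], [1, 0]], [[0, 1j], [-1j, 0]], [[-1, 0], [0, 1]]]
xi = [diagonal(s) for s in paulis]
gam = [off_diagonal(s) for s in paulis]
rho = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

# (277 a): gamma = rho xi = xi rho
gamma_from_rho = all(matmul(rho, x) == g and matmul(x, rho) == g
                     for x, g in zip(xi, gam))
# rho anticommutes with gamma_0, while gamma_0 commutes with xi
rho_anticommutes_g0 = matmul(rho, g0) == scaled(-1, matmul(g0, rho))
g0_commutes_xi = all(matmul(g0, x) == matmul(x, g0) for x in xi)
# (277 c): eta = -i rho xi is anti-Hermitian, whereas xi is Hermitian
eta = [scaled(-1j, matmul(rho, x)) for x in xi]
eta_antihermitian = all(dagger(e) == scaled(-1, e) for e in eta)
xi_hermitian = all(dagger(x) == x for x in xi)
print(gamma_from_rho, rho_anticommutes_g0, g0_commutes_xi,
      eta_antihermitian, xi_hermitian)
```

The last two checks anticipate the distinction drawn between the real quantity represented by $\boldsymbol\xi$ and the imaginary one represented by $\boldsymbol\eta$.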

It is clear that the matrix $\boldsymbol\xi$ must have in the Dirac theory a similar
physical meaning to that of the matrix $\boldsymbol\sigma$ in the Pauli theory, i.e. it must
represent, with a suitable numerical factor, the spin angular momentum
or the magnetic moment. The matrix $\boldsymbol\eta$ must represent accordingly,
when multiplied by $\mu$, the electric moment of the electron. An important
distinction between the matrices $\boldsymbol\xi$ and $\boldsymbol\eta$ consists in the fact that the
former is Hermitian and therefore represents a real quantity (with the
characteristic values $\pm1$), while the latter is anti-Hermitian and
therefore represents an imaginary quantity (with the characteristic values
$\pm i$).

This result seems at first sight to contradict the conclusion arrived
at in the preceding section, namely, that a moving electron possesses
a real electric moment represented approximately (in the corrected
Pauli theory) by the matrix $\mathbf u\times\boldsymbol\sigma/(2m_0c)$. As a matter of fact such
a contradiction does not exist, for the matrices $\mu\boldsymbol\xi$ and $\mu\boldsymbol\eta$ represent
the ‘rest-values’ of the magnetic and electric moments, i.e. their values
in a system of coordinates with respect to which the electron is at rest.
In a coordinate system with respect to which it is in motion, the
electron has an additional imaginary magnetic moment and an
additional real electric moment, these additional moments being numerically
equal and to a first approximation proportional to the velocity.

From the point of view of the classical theory, if $\boldsymbol\mu$ and $\boldsymbol\nu$ are the
rest-values of the magnetic and electric moments of a particle, then in
a coordinate system with respect to which this particle is moving with a
velocity $\mathbf v$ it will have an additional magnetic moment $\Delta\boldsymbol\mu$ equal, to
a first approximation, to $\dfrac{\mathbf v}{c}\times\boldsymbol\nu$ and an additional electric moment $\Delta\boldsymbol\nu$
equal (to the same approximation) to $\dfrac{\mathbf v}{c}\times\boldsymbol\mu$. Putting $\boldsymbol\nu = i\boldsymbol\mu$ we get
$\Delta\boldsymbol\mu = i\Delta\boldsymbol\nu$. The numerical equality of the two moments is thus
maintained for a moving electron (it can easily be shown to hold exactly),
the imaginary electric moment giving rise to an imaginary magnetic
one and the real magnetic moment to a real electric one. This real
electric moment is represented wave-mechanically by the operator
$\mu\,\mathbf u\times\boldsymbol\xi/(m_0c)$.

We can now turn to the discussion of the physical meaning of Dirac’s
first-order equation $\Gamma\psi = 0$. We shall note first of all that it can be
written in the standard form

$$(\epsilon+p_t)\psi = 0, \tag{278}$$

where $p_t$ denotes the operator $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$, multiplied by the four-dimensional
matrix $\delta$, and $\epsilon$ the first-order energy operator defined as the
four-dimensional matrix

$$\epsilon = U+c(\gamma_xu_x+\gamma_yu_y+\gamma_zu_z)+m_0c^2\gamma_0 = U+c\boldsymbol\gamma\cdot\mathbf u+m_0c^2\gamma_0. \tag{278 a}$$


The important point about Dirac’s equation, namely its relativistic
symmetry with regard to time and space, is revealed by the possibility
of writing it in one of the three other equivalent forms:

$$(P_x-p_x)\psi = 0,\quad (P_y-p_y)\psi = 0,\quad (P_z-p_z)\psi = 0,$$

corresponding to the election of one of the space coordinates to the
presidential role played in the usual form of the theory by the time.
Replacing the latter by the coordinate $x$, for example, we get for the
corresponding ‘momentum operator matrix’, with the help of the
equation $A\psi = 0$, the following expression:

$$P_x = G_x-\alpha_yu_y-\alpha_zu_z-\alpha_tu_t-\alpha_0m_0c,$$

where $G_x$, the $x$-component of the ‘potential momentum’ $e\mathbf A/c$, is
supposed to be multiplied by the unit matrix $\delta$. The same refers, of course,
to the operator $p_x = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$ in the equation $(P_x-p_x)\psi = 0$ (as well as
to the operators $p_y$ and $p_z$ in the two other momentum equations).

If the operator $\epsilon$ does not contain the time explicitly, then
equation (278) admits particular solutions $\psi = \psi^0e^{-i2\pi\epsilon't/h}$ for which it
reduces to the form $(\epsilon-\epsilon')\psi_{\epsilon'} = 0$. These solutions represent different
stationary states of the electron moving in a constant electromagnetic
field.

It can easily be shown in exactly the same way as in Pauli’s theory
that functions $\psi = \psi_{\epsilon'}$ and $\psi_{\epsilon''}$, belonging to different energy values
which form a discrete spectrum, satisfy the orthogonality relation

$$\int\psi^\dagger_{\epsilon'}\psi_{\epsilon''}\,dV = 0,$$

where $\psi^\dagger_{\epsilon'}\psi_{\epsilon''} = \sum_\alpha\psi^*_{\epsilon'\alpha}\psi_{\epsilon''\alpha}$. This enables us to build up a matrix
representation of physical quantities and a transformation theory which
differs from that based on Pauli’s equation by the fact that the
additional ‘spin-index’ $\alpha$ assumes four values instead of two. Another
important difference consists in the fact that Dirac’s equation
$(\epsilon-\epsilon')\psi = 0$ admits solutions corresponding to negative values of the
energy $\epsilon'$. This circumstance will be discussed in more detail later on
(§ 34).

It may seem at first sight that the wave-mechanical expression
(278 a), because it is linear in the operators $u_x$, $u_y$, $u_z$, representing the
components of the electron’s proper momentum, has no parallel in the
classical relativistic mechanics. A similar expression is obtained,
however, on the Einstein theory if the proper energy $mc^2 = m_0c^2/\sqrt{1-v^2/c^2}$
is rewritten in the form

$$mc^2 = \frac{m_0v^2}{\sqrt{1-v^2/c^2}}+m_0c^2\sqrt{1-v^2/c^2} = \mathbf v\cdot\mathbf g+m_0c^2\sqrt{1-v^2/c^2},$$

where $\mathbf g = m\mathbf v$ is the proper momentum. Putting $\epsilon = mc^2+U$ and
$\mathbf v\cdot\mathbf g = v_xg_x+v_yg_y+v_zg_z$, we get

$$\epsilon = U+v_xg_x+v_yg_y+v_zg_z+m_0c^2\sqrt{1-v^2/c^2}, \tag{279}$$

which becomes identical with the expression (278 a) if we replace the
proper momentum vector $\mathbf g$ by the operator $\mathbf u$, the velocity vector $\mathbf v$ by
the vector-matrix $c\boldsymbol\gamma$, and the expression $\sqrt{1-v^2/c^2}$ by the matrix $\gamma_0$.
We shall write this symbolically in the form of ordinary equations:

$$\mathbf g = \mathbf u,\qquad \mathbf v = c\boldsymbol\gamma,\qquad \sqrt{1-v^2/c^2} = \gamma_0. \tag{279 a}$$
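The classical identity on which (279) rests can be checked numerically. The sketch below is only an illustration with arbitrarily chosen units ($m_0 = 1$, $c = 1$); the function names are hypothetical.

```python
import math

def proper_energy(m0, v, c):
    """m c^2 = m0 c^2 / sqrt(1 - v^2/c^2)."""
    return m0 * c**2 / math.sqrt(1.0 - (v / c) ** 2)

def split_energy(m0, v, c):
    """v.g + m0 c^2 sqrt(1 - v^2/c^2), with g = m v the proper momentum."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    g = m0 * v / root
    return v * g + m0 * c**2 * root

speeds = [0.0, 0.1, 0.5, 0.9, 0.99]
identity_holds = all(
    math.isclose(proper_energy(1.0, v, 1.0), split_energy(1.0, v, 1.0),
                 rel_tol=1e-12)
    for v in speeds
)
print(identity_holds)
```

The first term of the split form is linear in the momentum, which is the feature carried over to the operator (278 a).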
The startling point about these relations is the fact that the classical 
momentum and velocity are replaced by operators of an entirely dif- 
ferent type. This may be due partially to the variation of the mass— 
which is the proportionality coefficient between momentum and velocity 
—as a function of the latter. If, however, this were the only reason 
for the difference, we should expect the relation 
$$\gamma_0\mathbf u = m_0c\boldsymbol\gamma$$

to hold—which of course is not the case (see below). 

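The fact that the operators $\mathbf u$ and $c\boldsymbol\gamma$ are the wave-mechanical
representatives of the momentum and velocity vectors respectively can be
established in a more direct and convincing way than has been done
above. Let us consider the classical equation of motion of the
relativity theory in the Lorentz-Einstein form

$$\frac{d\mathbf g}{dt} = e\left(\mathbf E+\frac1c\,\mathbf v\times\mathfrak H\right). \tag{280}$$

Replacing the classical time derivative of $\mathbf g$ by the wave-mechanical
expression

$$\frac{d\mathbf u}{dt} = \frac{\partial\mathbf u}{\partial t}+[\epsilon,\mathbf u], \tag{280 a}$$

we get, since $\mathbf u = \mathbf p-\mathbf G = \mathbf p-e\mathbf A/c$,

$$\frac{\partial\mathbf u}{\partial t} = -\frac ec\frac{\partial\mathbf A}{\partial t},$$

and further, with the help of the expression (278a), for the energy

$$[\epsilon,\mathbf u] = c[(\boldsymbol\gamma\cdot\mathbf u),\mathbf u]+[U,\mathbf u]$$

(since $\gamma_0$ commutes with $\mathbf u$). Now

$$[U,u_x] = [U,p_x] = -\frac{\partial U}{\partial x} = -e\frac{\partial\phi}{\partial x}$$

and

$$\begin{aligned}
c[(\boldsymbol\gamma\cdot\mathbf u),u_x] &= c\gamma_x[u_x,u_x]+c\gamma_y[u_y,u_x]+c\gamma_z[u_z,u_x]\\
&= \frac{2\pi ic}{h}\left[\gamma_y(u_yu_x-u_xu_y)+\gamma_z(u_zu_x-u_xu_z)\right]\\
&= e(\gamma_y\mathfrak H_z-\gamma_z\mathfrak H_y) = e(\boldsymbol\gamma\times\mathfrak H)_x,
\end{aligned}$$

according to (218). We thus have

$$[\epsilon,\mathbf u] = e[-\nabla\phi+\boldsymbol\gamma\times\mathfrak H],$$

and consequently

$$\frac{d\mathbf u}{dt} = e\left[-\frac1c\frac{\partial\mathbf A}{\partial t}-\nabla\phi+\boldsymbol\gamma\times\mathfrak H\right],$$

or finally

$$\frac{d\mathbf u}{dt} = e(\mathbf E+\boldsymbol\gamma\times\mathfrak H). \tag{280 b}$$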


This equation is of exactly the same form as the classical equation
(280) with $\mathbf g$ replaced by $\mathbf u$ and $\mathbf v/c$ by $\boldsymbol\gamma$ in agreement with (279 a).

Another still more direct and conclusive proof that the operator $c\boldsymbol\gamma$
is the wave-mechanical equivalent for the velocity is obtained by
calculating the operators $dx/dt$, $dy/dt$, $dz/dt$ which obviously represent
the components of the vector $\mathbf v$. We thus get

$$\frac{dx}{dt} = [\epsilon,x] = c[\gamma_xu_x,x]$$

(since all the other elementary operators constituting $\epsilon$ commute with
$x$), that is

$$\frac{dx}{dt} = c\gamma_x[u_x,x] = c\gamma_x[p_x,x] = c\gamma_x,$$

or

$$\frac{d\mathbf r}{dt} = c\boldsymbol\gamma, \tag{280 c}$$

which is the desired relation.

which is the desired relation. 
The physical meaning of the operator $c\boldsymbol\gamma$ as the representative of the
velocity can be finally recognized from the fact that, with the expression

$$\rho = \psi^\dagger\psi \tag{281}$$

for the density of probability, following from (232), the expressions
(232 a) for the probability current-density can be written in the form

$$\mathbf j = c\psi^\dagger(\boldsymbol\gamma\psi) = c(\psi^\dagger\boldsymbol\gamma^\dagger)\psi, \tag{281 a}$$

corresponding to the classical relation $\mathbf j = \rho\mathbf v$. We have, for instance,
according to (273 b), taking the $x$-component of $\mathbf j$,

$$\begin{aligned}
j_x = c\psi^\dagger\gamma_x\psi &= c[\psi_1^*(\gamma_x\psi)_1+\psi_2^*(\gamma_x\psi)_2+\psi_3^*(\gamma_x\psi)_3+\psi_4^*(\gamma_x\psi)_4]\\
&= c[\psi_1^*\psi_4+\psi_2^*\psi_3+\psi_3^*\psi_2+\psi_4^*\psi_1],
\end{aligned}$$

which coincides with the expression (232 a) for $j_x$. Since all the three
matrices $\gamma_x$, $\gamma_y$, $\gamma_z$ are Hermitian, we have $\boldsymbol\gamma^\dagger = \boldsymbol\gamma$, so that the two forms
(281a) for $\mathbf j$ (with $\boldsymbol\gamma$ acting on $\psi$ and $\boldsymbol\gamma^\dagger$ on $\psi^\dagger$) are equivalent, being
actually obtained from each other by the associative law of
multiplication.
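The component formula for $j_x$ can be illustrated with an arbitrary four-component $\psi$. The sketch below is not from the text: the sample amplitudes are made up, $c$ is set equal to 1, and $\gamma_x$ is typed in as it can be read off from (273 b).

```python
GAMMA_X = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]

def apply(matrix, psi):
    """Matrix acting on a one-column matrix psi."""
    return [sum(row[k] * psi[k] for k in range(4)) for row in matrix]

def bilinear(psi, matrix):
    """psi-dagger * matrix * psi."""
    m_psi = apply(matrix, psi)
    return sum(p.conjugate() * q for p, q in zip(psi, m_psi))

# Arbitrary sample amplitudes (purely illustrative).
psi = [1 + 2j, 0.5 - 1j, -2 + 0.5j, 3j]

j_x_matrix = bilinear(psi, GAMMA_X)
j_x_components = (psi[0].conjugate() * psi[3] + psi[1].conjugate() * psi[2]
                  + psi[2].conjugate() * psi[1] + psi[3].conjugate() * psi[0])
print(j_x_matrix, j_x_components)
```

Both expressions coincide and are real, as must be the case for the current associated with a Hermitian matrix.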

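The expressions (281) and (281 a) can be derived directly from Dirac’s
equation $\Gamma\psi = 0$, and this in a much simpler way than without the
use of the matrix notation. Multiplying, namely, this equation (on the
left) by $\psi^\dagger$ and subtracting from it the product of the adjoint equation
$\psi^\dagger\Gamma^\dagger = 0$ by $\psi$ (on the right), we get

$$\psi^\dagger(\Gamma\psi)-(\psi^\dagger\Gamma^\dagger)\psi = 0,$$

that is, since $\gamma_0^\dagger = \gamma_0$ and $\boldsymbol\gamma^\dagger = \boldsymbol\gamma$,

$$\psi^\dagger(u_t\psi)-(u_t^*\psi^\dagger)\psi+\psi^\dagger\boldsymbol\gamma\cdot(\mathbf u\psi)-(\mathbf u^*\psi^\dagger)\cdot\boldsymbol\gamma\psi = 0,$$

or finally

$$\frac{\partial}{\partial t}(\psi^\dagger\psi)+\operatorname{div}(c\psi^\dagger\boldsymbol\gamma\psi) = 0. \tag{281 b}$$

This is the equation of continuity for the probability density and
current density as defined by (281) and (281 a).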

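The expression (281 a) for the probability current-density can be
transformed (according to Gordon) in the following way. Replacing $\psi$ by
the expression $-(\boldsymbol\beta\cdot\mathbf u+\beta_tu_t)\psi/m_0c$, with the help of equation (272a) we
have

$$\mathbf j = c\psi^\dagger\boldsymbol\gamma\psi = -\frac{1}{m_0}\psi^\dagger\boldsymbol\gamma(\boldsymbol\beta\cdot\mathbf u+\beta_tu_t)\psi,$$

or, since $\boldsymbol\gamma = \beta_t\boldsymbol\beta$,

$$m_0\mathbf j = -\psi^\dagger\beta_t\boldsymbol\beta(\boldsymbol\beta\cdot\mathbf u)\psi-\psi^\dagger\beta_t\boldsymbol\beta\beta_tu_t\psi.$$

We have further, according to (276),

$$\beta_x\,\boldsymbol\beta\cdot\mathbf u = \beta_x\beta_xu_x+\beta_x\beta_yu_y+\beta_x\beta_zu_z = -u_x+i(\xi_yu_z-\xi_zu_y),$$

that is, $\boldsymbol\beta(\boldsymbol\beta\cdot\mathbf u) = -\mathbf u-i\mathbf u\times\boldsymbol\xi$
and $\boldsymbol\beta\beta_t = -i\boldsymbol\eta$. We thus get

$$\mathbf j = \frac{1}{m_0}\psi^\dagger\beta_t\mathbf u\psi+\frac{i}{m_0}\left[\psi^\dagger\beta_t(\mathbf u\times\boldsymbol\xi+\boldsymbol\eta u_t)\psi\right]. \tag{282}$$

Transforming in a similar way the factor $\psi^\dagger$ (instead of $\psi$) in the
expression $\mathbf j = c\psi^\dagger\boldsymbol\gamma\psi$ and adding the result to the previous one, we get finally,
remembering that

$$\beta_t^\dagger = \beta_t = \gamma_0,\qquad \boldsymbol\xi^\dagger = \boldsymbol\xi,\qquad \boldsymbol\eta^\dagger = -\boldsymbol\eta = i\boldsymbol\gamma,$$

$$\mathbf j = \frac{h}{4\pi im_0}\left[\psi^\dagger\gamma_0(\nabla\psi)-(\nabla\psi^\dagger)\gamma_0\psi\right]-\frac{e}{m_0c}\mathbf A\,\psi^\dagger\gamma_0\psi
+\frac{h}{4\pi m_0}\operatorname{curl}(\psi^\dagger\gamma_0\boldsymbol\xi\psi)+\frac{h}{4\pi m_0c}\frac{\partial}{\partial t}(\psi^\dagger\gamma_0\boldsymbol\eta\psi). \tag{282 a}$$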
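This expression multiplied by $e/c$ ($e$ = charge of the electron) gives the
density of the electric current (in e.m. units). The latter can accordingly
be written in the form

$$\frac ec\,\mathbf j = \frac{\mu}{i}\left[\psi^\dagger\gamma_0(\nabla\psi)-(\nabla\psi^\dagger)\gamma_0\psi\right]-\frac{e^2}{m_0c^2}\mathbf A\,\psi^\dagger\gamma_0\psi+\operatorname{curl}\mathfrak M+\frac1c\frac{\partial\mathfrak P}{\partial t}, \tag{282 b}$$

where

$$\mathfrak M = \mu\psi^\dagger\gamma_0\boldsymbol\xi\psi \quad\text{and}\quad \mathfrak P = \mu\psi^\dagger\gamma_0\boldsymbol\eta\psi. \tag{283}$$

The vector $\mathfrak M$ must obviously be interpreted as the ‘magnetization’,
i.e. the probable value per unit volume of the magnetic moment due
to the electron’s spin. Its components are expressed by the formulae

$$\begin{aligned}
\mathfrak M_x &= \mu[(\psi_1^*\psi_2+\psi_2^*\psi_1)-(\psi_3^*\psi_4+\psi_4^*\psi_3)]\\
\mathfrak M_y &= i\mu[(\psi_1^*\psi_2-\psi_2^*\psi_1)-(\psi_3^*\psi_4-\psi_4^*\psi_3)]\\
\mathfrak M_z &= \mu[(-\psi_1^*\psi_1+\psi_2^*\psi_2)-(-\psi_3^*\psi_3+\psi_4^*\psi_4)]
\end{aligned} \tag{283 a}$$

If in these expressions we neglect the products of $\psi_3$ with $\psi_4$ (which
are small quantities of the second order in $1/c$) they reduce to the
expressions (252b) of Pauli’s theory. Splitting up the matrix $\psi$ into
two two-dimensional matrices $\psi$, $\chi$, we can rewrite (283a) in the form

$$\mathfrak M = \mu(\psi^\dagger\boldsymbol\sigma\psi-\chi^\dagger\boldsymbol\sigma\chi). \tag{283 b}$$

The vector $\mathfrak P$ represents the ‘electric polarization’, i.e. the probable
value per unit volume of the electric moment due to the electron’s
spin. In spite of its imaginary appearance it is easily seen to be a real
quantity. We have, namely,

$$\begin{aligned}
\mathfrak P_x &= -i\mu[(\psi_1^*\psi_4-\psi_4^*\psi_1)+(\psi_2^*\psi_3-\psi_3^*\psi_2)]\\
\mathfrak P_y &= \mu[\psi_1^*\psi_4+\psi_4^*\psi_1-\psi_2^*\psi_3-\psi_3^*\psi_2]\\
\mathfrak P_z &= i\mu[(\psi_1^*\psi_3-\psi_3^*\psi_1)-(\psi_2^*\psi_4-\psi_4^*\psi_2)]
\end{aligned} \tag{283 c}$$

which can also be written in the form

$$\mathfrak P = -i\mu(\psi^\dagger\boldsymbol\sigma\chi-\chi^\dagger\boldsymbol\sigma\psi) \tag{283 d}$$

corresponding to (283b). If $\chi$ is replaced here by its approximate
expression in terms of $\psi$ according to (260) we get, with the help of
(260 a),

$$\mathfrak P = \frac{\mu}{m_0c}\,\psi^\dagger(\mathbf u\times\boldsymbol\sigma)\psi$$

in agreement with our previous interpretation of the operator

$$\boldsymbol\nu = \frac{\mu}{m_0c}\,\mathbf u\times\boldsymbol\sigma$$

as the electron’s real electric moment [cf. (261 b)].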
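That the polarization is real in spite of the factor $i$ can be illustrated numerically. The sketch below is an assumption-laden example, not part of the text: $\mu$ is set equal to 1, the sample $\psi$ is arbitrary, and the matrices are built from the Pauli matrices in the sign convention of the blocks of (275 b).

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def bilinear(psi, matrix):
    """psi-dagger * matrix * psi for a four-component psi."""
    m_psi = [sum(row[k] * psi[k] for k in range(4)) for row in matrix]
    return sum(p.conjugate() * q for p, q in zip(psi, m_psi))

def diagonal(s):
    return [s[0] + [0, 0], s[1] + [0, 0], [0, 0] + s[0], [0, 0] + s[1]]

def off_diagonal(s):
    return [[0, 0] + s[0], [0, 0] + s[1], s[0] + [0, 0], s[1] + [0, 0]]

paulis = [[[0, 1], [1, 0]], [[0, 1j], [-1j, 0]], [[-1, 0], [0, 1]]]
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
xi = [diagonal(s) for s in paulis]
# eta = -i gamma, with gamma = (0 sigma; sigma 0)
eta = [[[-1j * x for x in row] for row in off_diagonal(s)] for s in paulis]

psi = [1 + 2j, 0.5 - 1j, -2 + 0.5j, 3j]   # arbitrary sample amplitudes

magnetization = [bilinear(psi, matmul(g0, x)) for x in xi]
polarization = [bilinear(psi, matmul(g0, e)) for e in eta]

# Component formula (283 c) for the x-component, written out explicitly.
p_x_explicit = -1j * ((psi[0].conjugate() * psi[3] - psi[3].conjugate() * psi[0])
                      + (psi[1].conjugate() * psi[2] - psi[2].conjugate() * psi[1]))
print(magnetization, polarization, p_x_explicit)
```

All six components come out real, and the matrix expression for the $x$-component of the polarization agrees with the explicit formula.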



It is interesting to note that the magnetic moment is in Dirac’s
theory specified by the matrix $\gamma_0\boldsymbol\xi$ and not by the matrix $\boldsymbol\xi$ which was
assumed to specify the mechanical angular momentum due to spin.
This difference can be interpreted as the expression of the fact that in
the classical theory the ratio of the magnetic moment to the angular
momentum is equal to $e/(2cm)$ for orbital motion or $e/(cm)$ for the spin
motion, where $m$ is not the rest-mass, but the actual mass $m_0/\sqrt{1-v^2/c^2}$.
If, therefore, in wave mechanics the spin angular momentum is
represented by the matrix $h\boldsymbol\xi/4\pi$, then the magnetic moment must be
represented by the matrix $eh\gamma_0\boldsymbol\xi/(4\pi m_0c) = \mu\gamma_0\boldsymbol\xi$, since the classical quantity
$\sqrt{1-v^2/c^2}$ is represented by the matrix $\gamma_0$ [cf. (279a)].


32. General Treatment of the Spin Effect; Angular Momentum
and Magnetic Moment


The fact that the spin angular momentum must be represented by the
vector $\mathbf s = h\boldsymbol\xi/4\pi$ can be proved in the same way as in the case of the
Pauli theory (where $\boldsymbol\xi$ is replaced by $\boldsymbol\sigma$).

We have, to begin with, according to (275b), the following relations:

$$\xi_x\xi_y = -\xi_y\xi_x = i\xi_z,\quad \xi_y\xi_z = -\xi_z\xi_y = i\xi_x,\quad \xi_z\xi_x = -\xi_x\xi_z = i\xi_y, \tag{284}$$

and consequently

$$\boldsymbol\xi\times\boldsymbol\xi = 2i\boldsymbol\xi, \tag{284 a}$$

so that the matrix $\boldsymbol\xi$ satisfies the same relations as Pauli’s matrix $\boldsymbol\sigma$,
giving for the angular momentum $\mathbf s = \kappa\boldsymbol\xi$ ($\kappa = h/4\pi$) the usual
commutative relation $\mathbf s\times\mathbf s = -h\mathbf s/(2\pi i)$.

It should be mentioned that the characteristic values of the matrices
$\xi_x$, $\xi_y$, $\xi_z$ are equal to $\pm1$ (each value occurring twice), while those of
$\xi_x^2$, $\xi_y^2$, $\xi_z^2$ are equal to 1. The characteristic value of $s^2$ thus turns out
to be equal to $\tfrac34(h/2\pi)^2$, as before.
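The relations (284) and the value of $s^2$ can be confirmed by direct multiplication. The sketch below is illustrative only; the matrices are typed in as they can be read off from (275 b), the helper names are hypothetical, and $s^2$ is expressed in units of $(h/2\pi)^2$, so that $\kappa = \tfrac12$.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def scaled(c, m):
    return [[c * x for x in row] for row in m]

I = 1j
UNIT = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
xi_x = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
xi_y = [[0, I, 0, 0], [-I, 0, 0, 0], [0, 0, 0, I], [0, 0, -I, 0]]
xi_z = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]

def satisfies_284(a, b, c):
    """Check a b = -b a = i c."""
    return matmul(a, b) == scaled(I, c) and matmul(b, a) == scaled(-I, c)

cyclic_ok = (satisfies_284(xi_x, xi_y, xi_z)
             and satisfies_284(xi_y, xi_z, xi_x)
             and satisfies_284(xi_z, xi_x, xi_y))

# Each square is the unit matrix, so xi^2 = 3 and s^2 = 3 kappa^2.
squares_are_unit = all(matmul(m, m) == UNIT for m in (xi_x, xi_y, xi_z))
kappa = Fraction(1, 2)                 # h/4pi in units of h/2pi
s_squared = 3 * kappa**2
print(cyclic_ok, squares_are_unit, s_squared)
```

The value $\tfrac34$ obtained for $s^2$ is the one quoted in the text.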

It can easily be verified that the matrix $\gamma_0$ commutes with $\boldsymbol\xi$. Since,
further, its square is equal to 1, the preceding relations will hold for the
matrix $\gamma_0\boldsymbol\xi$ just as well as for $\boldsymbol\xi$. The necessity of interpreting the latter
and not the former as the spin angular momentum can be inferred in
an unambiguous way from the fact that the sum of $\mathbf s = h\boldsymbol\xi/4\pi$ and of the
orbital angular momentum

$$\mathbf L = \mathbf r\times\mathbf u, \tag{285}$$

that is, the vector $\mathbf M = \mathbf L+\mathbf s$,
satisfies the equation of motion

$$\frac{d\mathbf M}{dt} = \mathbf r\times\mathbf F,$$

where $\mathbf F = e(\mathbf E+\boldsymbol\gamma\times\mathfrak H)$ is the force acting on the electron [cf. (280b)],
and can accordingly be defined as the total angular momentum, while
the vector $\mathbf L+\gamma_0\mathbf s$ does not satisfy this equation.


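We have in fact

$$\frac{d\mathbf L}{dt} = \frac{d\mathbf r}{dt}\times\mathbf u+\mathbf r\times\frac{d\mathbf u}{dt},$$

that is

$$\frac{d\mathbf L}{dt} = c\boldsymbol\gamma\times\mathbf u+\mathbf r\times\mathbf F, \tag{285 a}$$

according to (280b) and (280c).
Replacing $\mathbf L$ by $\mathbf s$, we get, on the other hand,

$$\frac{d\mathbf s}{dt} = \kappa[c\boldsymbol\gamma\cdot\mathbf u+\gamma_0m_0c^2,\boldsymbol\xi],$$

or, putting $\boldsymbol\gamma = \rho\boldsymbol\xi$, since $\boldsymbol\xi$ commutes both with $\rho$ and with $\gamma_0$,

$$\frac{d\mathbf s}{dt} = \kappa c\rho[\boldsymbol\xi\cdot\mathbf u,\boldsymbol\xi].$$

Taking the $x$-component of this vector, we have

$$\frac{ds_x}{dt} = \kappa c\rho\,(u_y[\xi_y,\xi_x]+u_z[\xi_z,\xi_x]) = \frac{4\pi}{h}\kappa c\rho\,(u_y\xi_z-u_z\xi_y) = c\rho(\mathbf u\times\boldsymbol\xi)_x = c(\mathbf u\times\boldsymbol\gamma)_x$$

according to (284), that is,

$$\frac{d\mathbf s}{dt} = c\,\mathbf u\times\boldsymbol\gamma. \tag{285 b}$$

Adding (285a) and (285b), we get the equation

$$\frac{d}{dt}(\mathbf L+\mathbf s) = \frac{d\mathbf M}{dt} = \mathbf r\times\mathbf F, \tag{285 c}$$

which coincides with the classical equation for the total angular
momentum. In the case of a spherically symmetrical electric field and in
the absence of a magnetic field the product $\mathbf r\times\mathbf F$ vanishes, so that the
vector $\mathbf M$ is a constant of the motion.
Taking the square of $\mathbf M$, we get the expression

$$M^2 = L^2+2\mathbf L\cdot\mathbf s+s^2 \tag{286}$$

which is also a constant of the motion. Now since

$$s^2 = \tfrac34(h/2\pi)^2$$

is itself a constant, we get

$$\frac{d}{dt}(L^2+2\mathbf L\cdot\mathbf s) = 0. \tag{286 a}$$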

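The two terms in the brackets taken separately are not constant; as
has been shown, however, by Dirac, we obtain a new constant of the
motion, characteristic of the relation between $\mathbf{L}$ and $\mathbf{s}$, if we consider
the vector $\gamma_0\mathbf{L}\cdot\mathbf{s} = \mathbf{L}\cdot\mathbf{s}\gamma_0$. Taking the time derivative of this vector,
we get
$$\frac{d}{dt}(\mathbf{L}\cdot\mathbf{s}\gamma_0) = \frac{d\mathbf{L}}{dt}\cdot\mathbf{s}\gamma_0 + \mathbf{L}\cdot\frac{d}{dt}(\mathbf{s}\gamma_0).$$
Replacing $d\mathbf{L}/dt$ by the expression $c\boldsymbol{\gamma}\times\mathbf{p}$, according to (285a) (using
$\mathbf{u} = \mathbf{p}$ and $\mathbf{r}\times\mathbf{F} = 0$) we get
$$\frac{d\mathbf{L}}{dt}\cdot\mathbf{s} = c(\boldsymbol{\gamma}\times\mathbf{p})\cdot\mathbf{s} = c\kappa\rho(\boldsymbol{\xi}\times\mathbf{p})\cdot\boldsymbol{\xi} = -c\kappa\rho(\boldsymbol{\xi}\times\boldsymbol{\xi})\cdot\mathbf{p} = -2ic\kappa\rho\,\boldsymbol{\xi}\cdot\mathbf{p} = -2ic\kappa(\boldsymbol{\gamma}\cdot\mathbf{p}),$$
and consequently
$$\frac{d\mathbf{L}}{dt}\cdot\mathbf{s}\gamma_0 = -2ic\kappa(\boldsymbol{\gamma}\cdot\mathbf{p})\gamma_0 = -ic\kappa\,[(\boldsymbol{\gamma}\cdot\mathbf{p}),\gamma_0],$$
since $\boldsymbol{\gamma}$ anticommutes with $\gamma_0$, or
$$\frac{d\mathbf{L}}{dt}\cdot\mathbf{s}\gamma_0 = -i\kappa[\epsilon,\gamma_0] = -\frac{h^2}{8\pi^2}\frac{d\gamma_0}{dt}.$$
We have further
$$\frac{d}{dt}(\mathbf{s}\gamma_0) = -c(\boldsymbol{\gamma}\times\mathbf{p})\gamma_0 + i\boldsymbol{\xi}c(\mathbf{p}\cdot\boldsymbol{\gamma})\gamma_0 = [-(\boldsymbol{\gamma}\times\mathbf{p}) + i\boldsymbol{\xi}(\mathbf{p}\cdot\boldsymbol{\gamma})]\,c\gamma_0.$$
Now $\boldsymbol{\xi}(\mathbf{p}\cdot\boldsymbol{\gamma}) = \rho\boldsymbol{\xi}(\mathbf{p}\cdot\boldsymbol{\xi}) = \rho(\mathbf{p}+i\mathbf{p}\times\boldsymbol{\xi}) = \rho\mathbf{p}+i\mathbf{p}\times\boldsymbol{\gamma}$,
so that
$$\frac{d}{dt}(\mathbf{s}\gamma_0) = i\rho\mathbf{p}c\gamma_0,\qquad \mathbf{L}\cdot\frac{d}{dt}(\mathbf{s}\gamma_0) = 0,$$
since $\mathbf{L}\cdot\mathbf{p} = (\mathbf{r}\times\mathbf{p})\cdot\mathbf{p} = 0$. We thus get
$$\frac{d}{dt}\left[\left(\mathbf{L}\cdot\mathbf{s}+\frac{h^2}{8\pi^2}\right)\gamma_0\right] = 0,$$
that is,
$$\left(\mathbf{L}\cdot\mathbf{s}+\frac{h^2}{8\pi^2}\right)\gamma_0 = \text{const.} = \frac{h^2}{8\pi^2}\,k, \tag{287}$$
where $k$ is an ordinary number, replacing the angular quantum number
of the old theory; the fact that it can assume integral values only will
be shown later on.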

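Taking into account the identity
$$(\mathbf{L}\cdot\boldsymbol{\xi})^2 = L^2 + i(\mathbf{L}\times\mathbf{L})\cdot\boldsymbol{\xi} = L^2 - \frac{h}{2\pi}\mathbf{L}\cdot\boldsymbol{\xi} = L^2 - 2\mathbf{L}\cdot\mathbf{s}$$
and rewriting (287) in the form
$$\mathbf{L}\cdot\boldsymbol{\xi} = \frac{h}{2\pi}(k\gamma_0-1),$$
we get further
$$L^2 - 2\mathbf{L}\cdot\mathbf{s} = \left(\frac{h}{2\pi}\right)^2(k\gamma_0-1)^2,$$
that is,
$$L^2 = \left(\frac{h}{2\pi}\right)^2[(k\gamma_0-1)^2+(k\gamma_0-1)] = \left(\frac{h}{2\pi}\right)^2k(k-\gamma_0), \tag{287 a}$$
and
$$L^2 + 2\mathbf{L}\cdot\mathbf{s} = \left(\frac{h}{2\pi}\right)^2(k^2-1) = \text{const.}, \tag{287 b}$$
in agreement with (286a). Adding to both sides of this equation the
term $s^2 = \tfrac{3}{4}(h/2\pi)^2$, we obtain finally
$$M^2 = \left(\frac{h}{2\pi}\right)^2\left(k^2-\tfrac{1}{4}\right). \tag{287 c}$$
The latter expression is usually written in the form
$$M^2 = \left(\frac{h}{2\pi}\right)^2 j(j+1),$$
where $j = |k|-\tfrac{1}{2}$ is the so-called ‘inner’ or ‘total’ angular quantum
number.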
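
The bookkeeping in (287a)-(287c) is easy to verify with exact rational arithmetic. The sketch below works in units of $(h/2\pi)^2$ and treats $\gamma_0$ as one of its characteristic values $\pm 1$; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction

def inner_quantum_number(k):
    """j = |k| - 1/2 for a non-zero integer k."""
    if k == 0:
        raise ValueError("k = 0 is excluded")
    return abs(k) - Fraction(1, 2)

def m_squared(k):
    """M^2 in units of (h/2pi)^2, according to (287 c)."""
    return Fraction(k * k) - Fraction(1, 4)

def l_squared(k, gamma0):
    """L^2 in units of (h/2pi)^2, according to (287 a), for gamma0 = +1 or -1."""
    return k * (k - gamma0)

def two_l_dot_s(k, gamma0):
    """2 L.s in units of (h/2pi)^2, from L.xi = (h/2pi)(k*gamma0 - 1)."""
    return k * gamma0 - 1

for k in (-3, -2, -1, 1, 2, 3):
    j = inner_quantum_number(k)
    # (287 c) agrees with the usual form j(j+1).
    assert m_squared(k) == j * (j + 1)
    for gamma0 in (1, -1):
        # (287 b): the sum L^2 + 2 L.s does not depend on gamma0.
        assert l_squared(k, gamma0) + two_l_dot_s(k, gamma0) == k * k - 1
```

The loop shows that $L^2$ alone depends on the value of $\gamma_0$, while the combination $L^2 + 2\mathbf{L}\cdot\mathbf{s}$ does not, which is the content of (286a).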

An angular quantum number of the same character as that which
in the Schrödinger theory specifies $\mathbf{L}$ according to the formula
$L^2 = (h/2\pi)^2\,l(l+1)$ does not exist in Dirac’s theory, since $L^2$ is not
a constant of the motion, as shown by the formula (287a). It should
be noticed that the number $k$ can assume both positive and negative
values (which can be interpreted as corresponding respectively to the
same or to the opposite orientation of the orbit and spin axis), the value
$k = 0$ being obviously excluded [as seen from (287 c)].

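The preceding results, which are strictly valid for the motion in a
spherically symmetrical electric field, remain approximately valid in
the presence of a weak homogeneous magnetic field. Such a field $\mathbf{H}$,
which can be derived from the vector potential $\mathbf{A} = \tfrac{1}{2}\mathbf{H}\times\mathbf{r}$, corresponds
to the additional term $S_m = -(e/c)\mathbf{A}\cdot c\boldsymbol{\gamma} = -\tfrac{1}{2}e(\mathbf{H}\times\mathbf{r})\cdot\boldsymbol{\gamma}$, that is
$$S_m = -\frac{e}{2c}\,\mathbf{H}\cdot(\mathbf{r}\times c\boldsymbol{\gamma}) \tag{288}$$
in the energy operator $\epsilon$. This additional term can be identified with
the ordinary expression for the magnetic energy if the vector
$$\boldsymbol{\mu} = \frac{e}{2c}\,\mathbf{r}\times c\boldsymbol{\gamma} = \tfrac{1}{2}e\,\mathbf{r}\times\boldsymbol{\gamma} \tag{288 a}$$
is defined as the total magnetic moment of the electron.
We have in this case, according to (285c) with $\mathbf{F} = e\boldsymbol{\gamma}\times\mathbf{H}$,
$$\frac{d\mathbf{M}}{dt} = e\,\mathbf{r}\times(\boldsymbol{\gamma}\times\mathbf{H}).$$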


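With the help of the equation
$$\frac{d}{dt}[\mathbf{r}\times(\mathbf{r}\times\mathbf{H})] = \frac{d\mathbf{r}}{dt}\times(\mathbf{r}\times\mathbf{H}) + \mathbf{r}\times\left(\frac{d\mathbf{r}}{dt}\times\mathbf{H}\right)$$
we get, neglecting the left-hand term (since its time-average value
vanishes),
$$\boldsymbol{\gamma}\times(\mathbf{r}\times\mathbf{H}) + \mathbf{r}\times(\boldsymbol{\gamma}\times\mathbf{H}) = 0,$$
whence, using the vector identity
$$\mathbf{r}\times(\boldsymbol{\gamma}\times\mathbf{H}) + \boldsymbol{\gamma}\times(\mathbf{H}\times\mathbf{r}) + \mathbf{H}\times(\mathbf{r}\times\boldsymbol{\gamma}) = 0,$$
we obtain
$$\frac{d\mathbf{M}}{dt} = \tfrac{1}{2}e(\mathbf{r}\times\boldsymbol{\gamma})\times\mathbf{H} = \boldsymbol{\mu}\times\mathbf{H} \tag{288 b}$$
in agreement with the classical theory.
Taking the scalar product of both sides with $\mathbf{H}$, we get
$$\frac{d}{dt}(\mathbf{M}\cdot\mathbf{H}) = 0, \tag{288 c}$$
which means that the projection of the angular momentum in the
direction of the magnetic field remains constant.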

The formula (288 a) corresponds to the classical formula $\boldsymbol{\mu} = \tfrac{1}{2}e\,\mathbf{r}\times\mathbf{v}/c$
for the orbital magnetic moment due to the electron’s translational
motion alone, without any spin. According to the considerations
developed before in connexion with the spin magnetic moment $\mu\gamma_0\boldsymbol{\xi}$ one
might expect that the total magnetic moment would be expressed as
the sum
$$\frac{e}{2m_0c}\gamma_0\,\mathbf{r}\times\mathbf{u} + \mu\gamma_0\boldsymbol{\xi} = \frac{e}{2m_0c}\gamma_0(\mathbf{L}+2\mathbf{s}).$$
This expression is, however, not exactly equivalent to the expression
(288 a).

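In order to transform the operator $\boldsymbol{\mu}$ to an equivalent form of
the above type, we shall consider its probable or average value
$\int\psi^{\dagger}\boldsymbol{\mu}\psi\,dV$, which can obviously be written in the form
$$\bar{\boldsymbol{\mu}} = \frac{e}{2c}\int\mathbf{r}\times\mathbf{j}\,dV,$$
where $\mathbf{j} = c\psi^{\dagger}\boldsymbol{\gamma}\psi$ is the probability current density. Using the
expression (282 b) for $e\mathbf{j}/c$, we get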
ae eet t be 2 
i ra® |v yor X up dV + i [ rxcurla av +3 [exZe av. 


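Now the first integral is equal to the probable value of $\gamma_0\mathbf{L}$. With the
help of the vector identity
$$\nabla(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A} + \mathbf{A}\times\operatorname{curl}\mathbf{B} + \mathbf{B}\times\operatorname{curl}\mathbf{A}$$
we get further
$$\mathbf{r}\times\operatorname{curl}\mathfrak{M} = \nabla(\mathbf{r}\cdot\mathfrak{M}) - (\mathfrak{M}\cdot\nabla)\mathbf{r} - (\mathbf{r}\cdot\nabla)\mathfrak{M} = \nabla(\mathfrak{M}\cdot\mathbf{r}) - \mathfrak{M} - r\frac{\partial\mathfrak{M}}{\partial r},$$
since
$$\operatorname{curl}\mathbf{r} = 0,\qquad (\mathfrak{M}\cdot\nabla)\mathbf{r} = \mathfrak{M},\qquad (\mathbf{r}\cdot\nabla) = x\frac{\partial}{\partial x}+y\frac{\partial}{\partial y}+z\frac{\partial}{\partial z} = r\frac{\partial}{\partial r}.$$
In the latter expression $\partial/\partial r$ denotes a partial differentiation with regard
to the distance from the origin of a polar coordinate system, the two
angular coordinates being kept constant. Writing the volume element
$dV$ in the form $r^2\,dr\,d\omega$, where $d\omega$ denotes the element of solid angle,
we have
$$\int r\frac{\partial\mathfrak{M}}{\partial r}\,dV = \int d\omega\int_0^\infty r^3\frac{\partial\mathfrak{M}}{\partial r}\,dr = \int d\omega\int_0^\infty\frac{\partial}{\partial r}(r^3\mathfrak{M})\,dr - 3\int d\omega\int_0^\infty\mathfrak{M}r^2\,dr = -3\int\mathfrak{M}\,dV.$$
Consequently,
$$\tfrac{1}{2}\int\mathbf{r}\times\operatorname{curl}\mathfrak{M}\,dV = \int\mathfrak{M}\,dV = \mu\,\overline{\gamma_0\boldsymbol{\xi}} = \frac{e}{m_0c}\,\overline{\gamma_0\mathbf{s}}.$$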
We thus see that so far as its probable value is concerned the operator
$\boldsymbol{\mu}$ is equivalent, at least in the case of a stationary state when the


expression 3 
= i rx qs dV 


vanishes, to the following one:
$$\boldsymbol{\mu}_{\text{eff}} = \frac{e}{2m_0c}\gamma_0(\mathbf{L}+2\mathbf{s}).$$
This ‘effective’ magnetic moment can be replaced approximately by
the expression
$$\boldsymbol{\mu}'_{\text{eff}} = \frac{e}{2m_0c}(\mathbf{L}+2\mathbf{s})$$
not involving the factor $\gamma_0$, which accounts for the variation of the
mass with the velocity, and whose probable value differs by quantities
of the second order in $1/c$ from 1.

The fact that the expression (288 a) does not contain explicitly the
spin contribution to the magnetic moment shows very clearly that
the ‘spin-motion’ has no real existence as something independent of the
translational motion, but is actually a certain aspect of it. This
circumstance can be regarded as a consequence of the fact that in Dirac’s
theory there is no direct relation between the vector $\mathbf{u} = \mathbf{p}-e\mathbf{A}/c$
representing the proper momentum of the electron ($m\mathbf{v}$) and the vector
$c\boldsymbol{\gamma}$ representing its velocity. These two vectors cannot be treated
accordingly as parallel to each other. In fact, the lack of parallelism, as
measured by the vector product $c\boldsymbol{\gamma}\times\mathbf{u}$, can be considered according to
equations (285a) and (285b) as the cause of the change of the orbital
and spin components of the angular momentum in the absence of
a magnetic field.

The fact that the electron’s spin is not an independent kinematic
property but merely an aspect of the translational motion (resulting
from the divorce between the velocity and momentum) is indicated
also by the relation (277 b) between the matrices $\boldsymbol{\gamma}$ and $\boldsymbol{\xi}$ representing
respectively the translational and the ‘spin’ velocity. If the
proportionality coefficient $\rho$ were an ordinary number, then the relation $\boldsymbol{\gamma} = \rho\boldsymbol{\xi}$
would imply that the two vectors represented by $\boldsymbol{\gamma}$ and $\boldsymbol{\xi}$ were parallel
to each other. Since, however, $\rho$ is a matrix, such a parallelism does
not necessarily exist, as may be seen from the calculation of the
probable values of $\boldsymbol{\gamma}$ and $\boldsymbol{\xi}$.

It should be mentioned further that the characteristic values of the
matrices $\gamma_x, \gamma_y, \gamma_z$ are the same as those of $\xi_x, \xi_y, \xi_z$, that is, $+1$ and $-1$
(each of them occurring twice). This means that the characteristic
values of the components of the electron’s velocity as defined by the
vector $c\boldsymbol{\gamma}$ are equal either to $+c$ or $-c$. We have here the same type
of duplicity as in the case of the electron’s spin. For the components
of the momentum as represented by the vector $\mathbf{u}$ we get a continuous
spectrum extending from $-\infty$ to $+\infty$, as in the classical theory. The
same would refer to the velocity if the latter were defined not by the
vector $c\boldsymbol{\gamma}$ but by the vector $\gamma_0\mathbf{u}/m_0$, corresponding to the classical
relation between velocity and momentum. Such a definition is, however,
inconsistent with the relations $dx/dt = c\gamma_x$, etc., derived above.
It has been shown by V. Fock that, in spite of this, the two definitions
of the velocity become identical in the limiting case when the quantum
theory reduces to the classical one (for instance, in the case of a motion
with very large energy).

The relationship between the translational and spin motion can be 
interpreted according to Bohr as a particular case of Heisenberg’s 
uncertainty relation, resulting from the consideration of the magnetic 
force experienced or produced by a moving electrified particle without 
any actual spin. 

The magnetic field produced by such a particle (electron) at a distance
$r$ is given by the well-known Biot-Savart formula:


Now the exact determination of $\mathbf{H}$ according to this formula requires
the simultaneous knowledge both of the position, i.e. the radius vector
$\mathbf{r}$ of the electron (drawn from the point P for which $\mathbf{H}$ is to be
determined) and its velocity $\mathbf{v}$. This is, however, impossible, since it is only
possible to measure both quantities at the same time with a limited
accuracy, so that the products $\Delta x\,\Delta v_x$, etc., are at least of the order of
magnitude of $h/m_0$. This implies an inaccuracy
$$\Delta H \cong \frac{e}{c}\,\frac{\Delta v\,\Delta r}{r^3} \cong \frac{eh}{m_0c}\,\frac{1}{r^3} \cong \frac{\mu}{r^3}$$
in the determination of $\mathbf{H}$, which can be interpreted as an additional
magnetic field (of unknown direction) due to a particle with a magnetic
moment $\mu$. The superposition of the magnetic field produced by the
electron’s spin on that due to its translational motion thus secures the
validity of the uncertainty relation between position and velocity, so
far as they can be determined from the electron’s magnetic action.

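A similar result is obtained if we consider the force $\mathbf{F} = e\mathbf{v}\times\mathbf{H}/c$
experienced by an electron in a given external magnetic field. The
inaccuracy $\Delta\mathbf{r}$ in the electron’s location leads to an inaccuracy
$\Delta\mathbf{H} = (\Delta\mathbf{r}\cdot\nabla)\mathbf{H}$ in the estimation of the field strength $\mathbf{H}$. Replacing $\mathbf{v}$
in the preceding formula by the corresponding inaccuracy $\Delta\mathbf{v}$, we get
$$|\Delta\mathbf{F}| \cong \frac{e}{c}\,|\Delta\mathbf{v}\times(\Delta\mathbf{r}\cdot\nabla)\mathbf{H}| \cong \frac{eh}{m_0c}\nabla H \cong \mu\nabla H,$$
which agrees, with regard to the order of magnitude, with the force
acting on a magnet with moment $\boldsymbol{\mu}$ in an inhomogeneous field [$(\boldsymbol{\mu}\cdot\nabla)\mathbf{H}$].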


33. The Motion of an Electron in a Central Field of Force; Fine Structure and Zeeman Effect
We shall now turn to the more detailed discussion of the problem of 
the motion of an electron in a spherically symmetrical field of force 
according to Dirac’s theory. 

The function quadruplet $\psi_1, \psi_2, \psi_3, \psi_4$ corresponding to a definite
energy-level $\epsilon = \epsilon'$ can be determined in a general way from the
equation $(\epsilon-\epsilon')\psi = 0$. In the case under consideration it is, however, more
advantageous to start not with the energy but with the angular
constants of the motion and specify the functions so as to make them
the characteristic functions of the corresponding operators.


The most suitable operators for this purpose are $M_z$, the projection
of the angular momentum operator on one of the coordinate axes, and
the operator $M^2$; the operator $L^2$, although it is not an exact constant
of the motion, can also serve for the determination of $\psi$.

We get, from $M_z = L_z+s_z = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial\phi} + \dfrac{h}{4\pi}\xi_z$ and the definition (275 b) of
$\xi_z$, the following system of four ordinary equations:
$$\frac{1}{i}\frac{\partial\psi_1}{\partial\phi} - \tfrac{1}{2}\psi_1 = c'\psi_1,\qquad \frac{1}{i}\frac{\partial\psi_2}{\partial\phi} + \tfrac{1}{2}\psi_2 = c'\psi_2,$$
$$\frac{1}{i}\frac{\partial\psi_3}{\partial\phi} - \tfrac{1}{2}\psi_3 = c'\psi_3,\qquad \frac{1}{i}\frac{\partial\psi_4}{\partial\phi} + \tfrac{1}{2}\psi_4 = c'\psi_4,$$
where $c' = 2\pi M_z/h$ is a constant. An immediate consequence of these
equations is that the dependence of the functions $\psi_3, \psi_4$ on the longitude
$\phi$ is the same as that of the functions $\psi_1, \psi_2$. This dependence is
obviously given by the formulae
$$\psi_1 = A_1e^{i(m+1)\phi},\quad \psi_2 = A_2e^{im\phi},\quad \psi_3 = B_1e^{i(m+1)\phi},\quad \psi_4 = B_2e^{im\phi}, \tag{289 a}$$
where $A$ and $B$ are functions of the co-latitude $\theta$, with $c' = m+\tfrac{1}{2}$,
that is,
$$M_z = \frac{h}{2\pi}\left(m+\tfrac{1}{2}\right), \tag{289 b}$$
$m$ denoting an arbitrary integral number.

The determination of the functions $A$, $B$ can be carried out in the
simplest way by applying to $\psi$ the operator $L^2$. This gives, according
to the relation (287 a), with
$$\gamma_0 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix},$$
$$\begin{aligned}
L^2\psi_1 &= \left(\frac{h}{2\pi}\right)^2k(k-1)\psi_1, & L^2\psi_2 &= \left(\frac{h}{2\pi}\right)^2k(k-1)\psi_2,\\
L^2\psi_3 &= \left(\frac{h}{2\pi}\right)^2k(k+1)\psi_3, & L^2\psi_4 &= \left(\frac{h}{2\pi}\right)^2k(k+1)\psi_4.
\end{aligned} \tag{290}$$
Equations (290) show that the functions $\psi$, so far as their dependence
on the polar angles $\theta$, $\phi$ is concerned, are spherical harmonics, just as
in Schrödinger’s theory. (It will be remembered that $L^2 = -(h/2\pi)^2\Omega^2$,
where $\Omega^2$ is the Laplacian operator on the sphere, and that the equation
$\Omega^2\psi + l(l+1)\psi = 0$ is satisfied by spherical harmonic functions of the
order $l \geq 0$.) They show, moreover, that the function pairs $\psi_1, \psi_2$ and
$\psi_3, \psi_4$ are spherical harmonics of different orders, and that the number
$k$ which determines these orders can have integral values only. We
must distinguish two cases, namely, $k > 0$ and $k < 0$. In the former
case we get, putting $k = l+1$, with regard to (289a):
$$\begin{aligned}
\psi_1 &= a_1FY_{l,m+1}(\theta,\phi) = a_1FP_{l,m+1}(\theta)e^{i(m+1)\phi},\\
\psi_2 &= a_2FY_{l,m}(\theta,\phi) = a_2FP_{l,m}(\theta)e^{im\phi},\\
\psi_3 &= a_3GY_{l+1,m+1}(\theta,\phi) = a_3GP_{l+1,m+1}(\theta)e^{i(m+1)\phi},\\
\psi_4 &= a_4GY_{l+1,m}(\theta,\phi) = a_4GP_{l+1,m}(\theta)e^{im\phi},
\end{aligned} \tag{290 a}$$
where $F$ and $G$ are two unknown functions of the distance $r$ alone,
while $a_1, a_2, a_3, a_4$ are certain numerical coefficients. $P_{lm}(\theta)$ denotes the
associated spherical harmonic function $\sin^{|m|}\theta\,P_l^{(|m|)}(\cos\theta)$.
In the case $k < 0$ we shall put $l = -k = |k|$, which gives
$$\begin{aligned}
\psi_1 &= b_1FY_{l,m+1} = b_1FP_{l,m+1}(\theta)e^{i(m+1)\phi},\\
\psi_2 &= b_2FY_{l,m} = b_2FP_{l,m}(\theta)e^{im\phi},\\
\psi_3 &= b_3GY_{l-1,m+1} = b_3GP_{l-1,m+1}(\theta)e^{i(m+1)\phi},\\
\psi_4 &= b_4GY_{l-1,m} = b_4GP_{l-1,m}(\theta)e^{im\phi},
\end{aligned} \tag{290 b}$$
where $b_1, b_2, b_3, b_4$ are another set of coefficients.
The number
$$l = k-1\ (k > 0)\quad\text{or}\quad l = -k\ (k < 0),$$
i.e. the order of the spherical harmonic functions appearing in the
principal pair $\psi_1, \psi_2$, is called the angular quantum number of the state
in question. The two states specified by the functions (290a) and
(290 b) can be distinguished by their inner quantum number $j$ which
is equal to $l+\tfrac{1}{2}$ in the first case and to $l-\tfrac{1}{2}$ in the second (i.e. in both
cases to the arithmetic mean of the orders of the spherical harmonics
in $\psi_1, \psi_2$ and $\psi_3, \psi_4$). The two states belonging to the same $j$ and to
different values $j\pm\tfrac{1}{2}$ of $l$ are specified by functions of the type
$\psi_{1,2}\sim Y_{j-\frac{1}{2}}$, $\psi_{3,4}\sim Y_{j+\frac{1}{2}}$ and $\psi_{1,2}\sim Y_{j+\frac{1}{2}}$, $\psi_{3,4}\sim Y_{j-\frac{1}{2}}$,
respectively.
The ratio between the coefficients $a_1, a_2$ on the one hand and $a_3, a_4$
on the other can be determined from the equation
$$(M^2-\lambda)\psi = 0\qquad\left[\lambda = \left(\frac{h}{2\pi}\right)^2\left(k^2-\tfrac{1}{4}\right)\right], \tag{291}$$
which can serve for the complete determination of the angular factors
in the quadruplet $\psi$ (inasmuch as the direction of the privileged axis
remains unsettled). It is somewhat simpler, however, to combine for
that purpose the equations (290) and (287). Putting $\mathbf{L} = h\boldsymbol{\Lambda}/2\pi$, we
can rewrite the latter in the form
$$\boldsymbol{\Lambda}\cdot\boldsymbol{\xi} = k\gamma_0-1, \tag{291 a}$$
which is equivalent to the system of equations
$$\begin{aligned}
(\Lambda_x+i\Lambda_y)\psi_2-\Lambda_z\psi_1 &= (k-1)\psi_1,\\
(\Lambda_x-i\Lambda_y)\psi_1+\Lambda_z\psi_2 &= (k-1)\psi_2,\\
(\Lambda_x+i\Lambda_y)\psi_4-\Lambda_z\psi_3 &= -(k+1)\psi_3,\\
(\Lambda_x-i\Lambda_y)\psi_3+\Lambda_z\psi_4 &= -(k+1)\psi_4.
\end{aligned} \tag{292}$$
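We have here
$$\boldsymbol{\Lambda} = \frac{1}{i}\,\mathbf{r}\times\nabla,$$
or in polar coordinates $r$, $\theta$, $\phi$,
$$\Lambda_x+i\Lambda_y = e^{i\phi}\left(\frac{\partial}{\partial\theta}+i\cot\theta\frac{\partial}{\partial\phi}\right),\qquad \Lambda_x-i\Lambda_y = e^{-i\phi}\left(-\frac{\partial}{\partial\theta}+i\cot\theta\frac{\partial}{\partial\phi}\right),\qquad \Lambda_z = \frac{1}{i}\frac{\partial}{\partial\phi}.$$
The first two expressions can be obtained as follows. We shall put for
the sake of brevity $\partial/\partial x = \partial_x$, etc. We shall further introduce the
complex variable $w = x+iy$ and the corresponding derivative $\partial_w = \partial_x+i\partial_y$.
We get then
$$\Lambda_x+i\Lambda_y = z\partial_w - w\partial_z.$$
On the other hand, we have
$$\partial_\theta = \cot\theta\,(x\partial_x+y\partial_y) - \tan\theta\,z\partial_z,$$
and
$$x\partial_x+y\partial_y = w^*\partial_w - i\partial_\phi = w^*\partial_w + \Lambda_z,$$
whence
$$\partial_w = \frac{1}{w^*}(x\partial_x+y\partial_y-\Lambda_z) = \frac{\tan\theta}{w^*}\bigl[(\partial_\theta+\tan\theta\,z\partial_z)-\cot\theta\,\Lambda_z\bigr],$$
and consequently
$$\Lambda_x+i\Lambda_y = \frac{z\tan\theta}{w^*}\bigl[(\partial_\theta+\tan\theta\,z\partial_z)+\cot\theta\,i\partial_\phi\bigr] - w\partial_z.$$
Since
$$\frac{z\tan\theta}{w^*} = \frac{|w|}{w^*} = \frac{w}{|w|} = e^{i\phi},$$
we find finally $\Lambda_x+i\Lambda_y = e^{i\phi}(\partial_\theta+i\cot\theta\,\partial_\phi)$.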


We thus get in the case of the functions (290a) ($k = l+1$):
$$\begin{aligned}
a_2\left(\frac{\partial}{\partial\theta}-m\cot\theta\right)P_{l,m} &= a_1(l+m+1)P_{l,m+1},\\
a_1\left(\frac{\partial}{\partial\theta}+(m+1)\cot\theta\right)P_{l,m+1} &= -a_2(l-m)P_{l,m},\\
a_4\left(\frac{\partial}{\partial\theta}-m\cot\theta\right)P_{l+1,m} &= -a_3(l-m+1)P_{l+1,m+1},\\
a_3\left(\frac{\partial}{\partial\theta}+(m+1)\cot\theta\right)P_{l+1,m+1} &= a_4(l+m+2)P_{l+1,m}.
\end{aligned} \tag{292 a}$$
These equations can be used not only for the definition of the ratios
$a_1 : a_2$ and $a_3 : a_4$ but also for the determination of the ‘associated’
spherical harmonics $P_{lm}$, etc. (supposed to be normalized in the same
way for all values of $l$ and $m$). Eliminating $P_{l,m+1}$ between the first
two equations (292a), we get, for instance,
$$\left[\frac{d^2}{d\theta^2}+\cot\theta\frac{d}{d\theta}-\frac{m^2}{\sin^2\theta}+l(l+1)\right]P_{lm} = 0,$$
which is the standard equation for the function $P_{lm}$.
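
The differential equation obtained by this elimination can be checked numerically on a few low-order cases. The sketch below is illustrative only: it uses unnormalized closed forms $P_{lm}(\theta) = \sin^m\theta\,d^mP_l(x)/dx^m$ at $x = \cos\theta$ for $l \le 2$ and central finite differences, and the names `ASSOCIATED` and `legendre_residual` are hypothetical.

```python
import math

# Unnormalized P_lm(theta) = sin^m(theta) * d^m P_l(x)/dx^m at x = cos(theta).
ASSOCIATED = {
    (1, 0): lambda t: math.cos(t),
    (1, 1): lambda t: math.sin(t),
    (2, 0): lambda t: 0.5 * (3.0 * math.cos(t) ** 2 - 1.0),
    (2, 1): lambda t: 3.0 * math.sin(t) * math.cos(t),
    (2, 2): lambda t: 3.0 * math.sin(t) ** 2,
}

def legendre_residual(l, m, theta, step=1e-4):
    """Left-hand side of the equation for P_lm, evaluated by finite differences."""
    p = ASSOCIATED[(l, m)]
    first = (p(theta + step) - p(theta - step)) / (2.0 * step)
    second = (p(theta + step) - 2.0 * p(theta) + p(theta - step)) / step ** 2
    return (second
            + first / math.tan(theta)
            - m * m * p(theta) / math.sin(theta) ** 2
            + l * (l + 1) * p(theta))
```

For any of the tabulated pairs and any angle away from the poles, `legendre_residual` should vanish up to the finite-difference error.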
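In the case (290b) we get with $k = -l$ a similar set of equations,
namely,
$$\begin{aligned}
b_2\left(\frac{\partial}{\partial\theta}-m\cot\theta\right)P_{l,m} &= -b_1(l-m)P_{l,m+1},\\
b_1\left(\frac{\partial}{\partial\theta}+(m+1)\cot\theta\right)P_{l,m+1} &= b_2(l+m+1)P_{l,m},\\
b_4\left(\frac{\partial}{\partial\theta}-m\cot\theta\right)P_{l-1,m} &= b_3(l+m)P_{l-1,m+1},\\
b_3\left(\frac{\partial}{\partial\theta}+(m+1)\cot\theta\right)P_{l-1,m+1} &= -b_4(l-m-1)P_{l-1,m}.
\end{aligned} \tag{292 b}$$
We shall not write down the explicit expressions for the coefficients
$a$, $b$ (which depend upon the way the functions $P$ are normalized), and
shall now turn to the investigation of the radial factors $F$, $G$ and the
associated question of the characteristic values of the energy $\epsilon$.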

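The functions $F$ and $G$ can be investigated by transforming the
equation $(\epsilon-\epsilon')\psi = 0$ to polar coordinates and getting rid of the angular
factors in $\psi$ with the help of the preceding expressions.

To carry out this transformation we multiply the term $\boldsymbol{\gamma}\cdot\mathbf{u}$ in $\epsilon$ by
the square of the ‘radial projection’ of the vector $\boldsymbol{\gamma}$:
$$\gamma_r = \frac{1}{r}\,\boldsymbol{\gamma}\cdot\mathbf{r}. \tag{293}$$
Taking into account the general relation
$$(\boldsymbol{\gamma}\cdot\mathbf{A})(\boldsymbol{\gamma}\cdot\mathbf{B}) = (\boldsymbol{\xi}\cdot\mathbf{A})(\boldsymbol{\xi}\cdot\mathbf{B}) = \mathbf{A}\cdot\mathbf{B}+i(\mathbf{A}\times\mathbf{B})\cdot\boldsymbol{\xi},$$
we get $\gamma_r^2 = 1$ and, further,
$$\gamma_r(\boldsymbol{\gamma}\cdot\mathbf{u}) = \frac{1}{r}(\mathbf{r}\cdot\mathbf{u}+i\mathbf{L}\cdot\boldsymbol{\xi}),$$
whence
$$\boldsymbol{\gamma}\cdot\mathbf{u} = \gamma_r\gamma_r(\boldsymbol{\gamma}\cdot\mathbf{u}) = \frac{\gamma_r}{r}(\mathbf{r}\cdot\mathbf{u}+i\mathbf{L}\cdot\boldsymbol{\xi}). \tag{293 a}$$
Now for a spherically symmetrical electric field we have $\mathbf{u} = \mathbf{p}$, and
consequently
$$\mathbf{r}\cdot\mathbf{u} = \frac{h}{2\pi i}\,r\frac{\partial}{\partial r}.$$
We thus get, with the help of the equation $\mathbf{L}\cdot\boldsymbol{\xi} = h(k\gamma_0-1)/2\pi$,
$$\epsilon = c\gamma_r\frac{h}{2\pi i}\left(\frac{\partial}{\partial r}-\frac{k\gamma_0-1}{r}\right)+m_0c^2\gamma_0+U,$$
so that Dirac’s equation reduces to the form
$$\left[\gamma_r\left(\frac{\partial}{\partial r}-\frac{k\gamma_0-1}{r}\right)+\frac{2\pi i}{hc}(\epsilon_0\gamma_0+U-\epsilon')\right]\psi = 0,$$
where $\epsilon_0 = m_0c^2$.
Since the operator-matrix $\gamma_r$ commutes with $\partial/\partial r$ and $1/r$, and
anticommutes with $\gamma_0$,
$$\left[\left(\frac{\partial}{\partial r}+\frac{1+k\gamma_0}{r}\right)\gamma_r+\frac{2\pi i}{hc}(\epsilon_0\gamma_0-\epsilon'+U)\right]\psi = 0. \tag{294}$$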
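By the definition of the matrices $\gamma_x, \gamma_y, \gamma_z$ [cf. (273 b)], we have
$$(\gamma_r\psi)_1 = \frac{1}{r}[(x+iy)\psi_4-z\psi_3],\qquad (\gamma_r\psi)_2 = \frac{1}{r}[(x-iy)\psi_3+z\psi_4],$$
$$(\gamma_r\psi)_3 = \frac{1}{r}[(x+iy)\psi_2-z\psi_1],\qquad (\gamma_r\psi)_4 = \frac{1}{r}[(x-iy)\psi_1+z\psi_2],$$
or, putting $\psi_1 = \phi_1$, $\psi_2 = \phi_2$, $\psi_3 = \chi_1$, $\psi_4 = \chi_2$, and
$\sigma_r = \dfrac{1}{r}(\boldsymbol{\sigma}\cdot\mathbf{r})$,
$$(\gamma_r\psi)_1 = (\sigma_r\chi)_1,\quad (\gamma_r\psi)_2 = (\sigma_r\chi)_2,\quad (\gamma_r\psi)_3 = (\sigma_r\phi)_1,\quad (\gamma_r\psi)_4 = (\sigma_r\phi)_2.$$
The equation (294) is thus equivalent to the following two:
$$\begin{aligned}
\left(\frac{\partial}{\partial r}+\frac{1+k}{r}\right)(\sigma_r\chi)-\frac{2\pi i}{hc}(\epsilon'-\epsilon_0-U)\phi &= 0,\\
\left(\frac{\partial}{\partial r}+\frac{1-k}{r}\right)(\sigma_r\phi)-\frac{2\pi i}{hc}(\epsilon'+\epsilon_0-U)\chi &= 0.
\end{aligned} \tag{294 a}$$
The latter equation can be multiplied by $\sigma_r$, giving, since $\sigma_r$ commutes
with $\partial/\partial r$ and since its square is equal to 1, just as for $\gamma_r$,
$$\left(\frac{\partial}{\partial r}+\frac{1-k}{r}\right)\phi-\frac{2\pi i}{hc}(\epsilon'+\epsilon_0-U)(\sigma_r\chi) = 0. \tag{294 b}$$
The equations (294a) and (294b) serve for the determination of the
functions $\phi$ and $\sigma_r\chi$. It should be remembered that each of these
functions represents a pair of ordinary functions. We thus see that the two
functions of each pair have the same radial factor, in agreement with
our previous results. Putting
$$\phi = F(r),\qquad \sigma_r\chi = iG(r),$$
we obtain the following system:
$$\begin{aligned}
\left(\frac{d}{dr}+\frac{1-k}{r}\right)F+\frac{2\pi}{hc}(\epsilon'+\epsilon_0-U)G &= 0,\\
\left(\frac{d}{dr}+\frac{1+k}{r}\right)G-\frac{2\pi}{hc}(\epsilon'-\epsilon_0-U)F &= 0.
\end{aligned} \tag{295}$$
Using the identity
$$\left(\frac{d}{dr}+\frac{1}{r}\right)F = \frac{1}{r}\frac{d}{dr}(rF),$$
we have
$$\begin{aligned}
\left(\frac{d}{dr}-\frac{k}{r}\right)f+\frac{2\pi}{hc}(\epsilon'+\epsilon_0-U)g &= 0,\\
\left(\frac{d}{dr}+\frac{k}{r}\right)g-\frac{2\pi}{hc}(\epsilon'-\epsilon_0-U)f &= 0,
\end{aligned} \tag{296}$$
where $g = rG$, $f = rF$.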

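We shall solve these equations for the particular case of the
hydrogen-like atom, i.e. an electron moving in a Coulomb field with a potential
energy $U = -Ze^2/r$. We shall assume that $\epsilon' < \epsilon_0$, which corresponds
to a bound electron ($E' < 0$) and leads to a discrete set of energy-levels.
Putting, for the sake of brevity,
$$\frac{2\pi}{hc}(\epsilon'+\epsilon_0) = \alpha^2,\qquad \frac{2\pi}{hc}(\epsilon_0-\epsilon') = \beta^2,\qquad \frac{2\pi Ze^2}{hc} = \gamma,$$
we get for this case
$$\begin{aligned}
\left(\frac{d}{dr}-\frac{k}{r}\right)f+\left(\alpha^2+\frac{\gamma}{r}\right)g &= 0,\\
\left(\frac{d}{dr}+\frac{k}{r}\right)g+\left(\beta^2-\frac{\gamma}{r}\right)f &= 0.
\end{aligned} \tag{296 a}$$
For large values of $r$ these equations reduce to
$$\frac{df}{dr}+\alpha^2g = 0,\qquad \frac{dg}{dr}+\beta^2f = 0,$$
giving the following asymptotic solution:
$$f = Ae^{-\alpha\beta r},\qquad g = Be^{-\alpha\beta r},\qquad \beta A = \alpha B,$$
where $A$ and $B$ are considered as constants.
To get the exact solution we replace them by polynomials
$$\begin{aligned}
A &= A_0r^{\mu}+A_1r^{\mu+1}+\dots+A_sr^{\mu+s},\\
B &= B_0r^{\mu}+B_1r^{\mu+1}+\dots+B_sr^{\mu+s},
\end{aligned} \tag{296 b}$$
obtaining the following relations between the coefficients:
$$\begin{aligned}
A_n(\mu+n-k)+\gamma B_n &= \alpha\beta A_{n-1}-\alpha^2B_{n-1},\\
B_n(\mu+n+k)-\gamma A_n &= \alpha\beta B_{n-1}-\beta^2A_{n-1}.
\end{aligned} \tag{297}$$
Multiplying the first of these equations by $\beta$ and the second by $\alpha$ and
adding the results, we get
$$A_n[\beta(\mu+n-k)-\alpha\gamma]+B_n[\alpha(\mu+n+k)+\beta\gamma] = 0. \tag{297 a}$$
The ‘boundary conditions’ $A_n = B_n = 0$ for $n = -1$ and $n = s+1$
applied to (297) give
$$A_0(\mu-k)+\gamma B_0 = 0,\qquad B_0(\mu+k)-\gamma A_0 = 0;\qquad \beta A_s = \alpha B_s.$$
Eliminating $A_0$ and $B_0$ between the first two equations, we get
$$\mu = \pm\sqrt{k^2-\gamma^2}. \tag{297 b}$$
The ratio
$$\frac{A_0}{B_0} = \frac{\sqrt{k^2-\gamma^2}+k}{\gamma} = \frac{\gamma}{k-\sqrt{k^2-\gamma^2}},$$
which follows from the preceding equation, is identical with that which
is obtained from (297a) for $n = 0$. With $n = s$ we get, on the other
hand,
$$A_s[\beta(\mu+s-k)-\alpha\gamma]+B_s[\alpha(\mu+s+k)+\beta\gamma] = 0,$$
which becomes identical with $\beta A_s = \alpha B_s$ on using the condition
$$2\alpha\beta(\mu+s) = (\alpha^2-\beta^2)\gamma. \tag{297 c}$$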

With the above definitions of $\alpha$, $\beta$ we get
$$\sqrt{\epsilon_0^2-\epsilon'^2}\,(\mu+s) = \epsilon'\gamma,$$
that is, from (297 b),
$$\epsilon' = \epsilon_0\left[1+\frac{\gamma^2}{\bigl[s+\sqrt{k^2-\gamma^2}\bigr]^2}\right]^{-\frac{1}{2}}. \tag{298}$$
This is exactly Sommerfeld’s formula (264) (with $\gamma Z$ replaced by $\gamma$).†
The angular quantum number $k$ has the same meaning in both cases,
so far as the value of the energy is concerned. It must be remembered,
however, that in the previous theory it was supposed to be essentially
positive, whereas in Dirac’s theory it can assume both positive and
negative values (zero excluded). With $k > 0$ we get $l = k-1$ and
$j = l+\tfrac{1}{2} = k-\tfrac{1}{2}$, i.e. a solution of the type (290a); while in the case
$k < 0$ we obtain a solution of the type (290b) with $l = |k|$ and
$j = |k|-\tfrac{1}{2}$.
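
Formula (298) and the correspondence between $k$ and the pair $(l, j)$ lend themselves to a direct numerical evaluation. The following is a minimal sketch, assuming approximate numerical values for the fine-structure constant and the electron rest energy (both are inserted here for illustration and are not quoted in the text); the function names are hypothetical.

```python
import math

ALPHA = 1 / 137.036          # fine-structure constant (approximate value, assumed)
REST_ENERGY_EV = 510998.95   # electron rest energy m0 c^2 in eV (approximate, assumed)

def sommerfeld_ratio(k, s, z=1, alpha=ALPHA):
    """Return eps'/eps0 from (298) for integer k != 0 and radial number s >= 0."""
    if k == 0:
        raise ValueError("k = 0 is excluded")
    gamma = z * alpha                       # gamma = 2*pi*Z*e^2/(h*c)
    mu = math.sqrt(k * k - gamma * gamma)   # positive root of (297 b)
    return 1.0 / math.sqrt(1.0 + (gamma / (s + mu)) ** 2)

def l_and_j(k):
    """Map k to (l, j): l = k - 1 for k > 0, l = |k| for k < 0; j = |k| - 1/2."""
    l = k - 1 if k > 0 else -k
    return l, abs(k) - 0.5

def binding_energy_ev(k, s, z=1):
    """Energy measured from the rest energy, (eps' - eps0), in electron-volts."""
    return (sommerfeld_ratio(k, s, z) - 1.0) * REST_ENERGY_EV
```

With these assumed constants `binding_energy_ev(1, 0)` comes out close to the familiar hydrogen ground-state value of about 13.6 eV below the rest energy, and `sommerfeld_ratio(k, s)` is the same for $+k$ and $-k$, which is the degeneracy discussed in the text.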

It should be emphasized that the two solutions are characterized not
only by different angular factors, but also, as is plainly seen from (297),
by different radial factors $F = f/r$ and $G = g/r$; their similarity is
restricted to the value of the energy and of the z-component of the
angular momentum $M_z$.
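
The recurrences (297) can be run numerically to see that the series really terminates when the energy is given by (298), and that the coefficients differ for $+k$ and $-k$. The sketch below is illustrative: it chooses units in which $2\pi\epsilon_0/hc = 1$ (an assumption made for convenience, so that $\alpha^2 = 1+\epsilon'/\epsilon_0$ and $\beta^2 = 1-\epsilon'/\epsilon_0$), normalizes $B_0 = 1$, and uses a hypothetical function name.

```python
import math

def radial_coefficients(k, s, gamma):
    """Coefficients A_0..A_{s+1}, B_0..B_{s+1} generated by the recurrences (297).

    Units: 2*pi*eps0/(h*c) = 1 (assumed), so alpha^2 = 1 + E, beta^2 = 1 - E
    with E = eps'/eps0 taken from (298). For k < 0 use s >= 1.
    """
    mu = math.sqrt(k * k - gamma * gamma)                      # (297 b), positive root
    energy = 1.0 / math.sqrt(1.0 + (gamma / (s + mu)) ** 2)    # (298)
    alpha, beta = math.sqrt(1.0 + energy), math.sqrt(1.0 - energy)
    a = [(mu + k) / gamma]      # ratio A_0/B_0 with the normalization B_0 = 1
    b = [1.0]
    for n in range(1, s + 2):
        rhs1 = alpha * beta * a[-1] - alpha ** 2 * b[-1]
        rhs2 = alpha * beta * b[-1] - beta ** 2 * a[-1]
        # Solve (mu+n-k) A_n + gamma B_n = rhs1, -gamma A_n + (mu+n+k) B_n = rhs2.
        det = (mu + n - k) * (mu + n + k) + gamma * gamma
        a.append((rhs1 * (mu + n + k) - gamma * rhs2) / det)
        b.append(((mu + n - k) * rhs2 + gamma * rhs1) / det)
    return a, b
```

The last entries of the two returned lists are $A_{s+1}$ and $B_{s+1}$; they should vanish to rounding accuracy, which is the termination condition expressed by (297c).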

The coincidence of the energy-levels corresponding to opposite values
of $k$ is a characteristic feature of the motion in a purely Coulomb field
of force. If the motion of the electron takes place in a field even
moderately deviating from the latter, due, for instance, to the variable
shielding action of the inner electrons in an alkali atom, the energies
of the states $+k$ and $-k$ become different and we obtain what is called
a ‘screening doublet’. The two levels of such a doublet state belong
to two different values of the Schrödinger angular number $l$, namely,
$l = |k|-1$ and $l = |k|$, and to the same value of the inner quantum
number $j = |k|-\tfrac{1}{2}$. It should be mentioned that in the case of small
values of $j$ the separation between the two energy-levels in alkali atoms
or ions of a similar structure is so large that they are no longer
considered as forming a doublet and are referred to different series. This
notion can, however, be conveniently applied to X-ray absorption levels.

† If instead of Dirac’s equation we used the relativity second-order equation $D\psi = 0$,
in the present case
$$\nabla^2\psi+\frac{4\pi^2}{h^2c^2}\left[\left(\epsilon'+\frac{Ze^2}{r}\right)^2-\epsilon_0^2\right]\psi = 0,$$
not involving the spin, we should have obtained a solution of the same type
$\psi = F(r)Y_{lm}(\theta,\phi)$
as in Schrödinger’s theory, with
$$rF = f = e^{-\alpha\beta r}\sum_{n=0}^{s}b_nr^{\mu+n}$$
and
$$\epsilon' = \epsilon_0\left[1+\frac{\gamma^2}{\bigl[s+\tfrac{1}{2}+\sqrt{(l+\tfrac{1}{2})^2-\gamma^2}\bigr]^2}\right]^{-\frac{1}{2}},$$
corresponding to half-integral values of the radial and angular quantum numbers ($s+\tfrac{1}{2}$
instead of $s$, and $l+\tfrac{1}{2}$ instead of $k$). This result is, however, contradicted by the
experimental data, which are in agreement with Sommerfeld’s formula.

The two levels corresponding to the same value of the Schrödinger
angular quantum number $l$ and to consecutive values of the inner
quantum number $j = l-\tfrac{1}{2}$ ($k = -l$) and $j = l+\tfrac{1}{2}$ ($k = l+1$) are said
to form a ‘relativity doublet’. According to Sommerfeld’s formula (298)
they correspond to consecutive values of the old angular quantum
number $|k|$ ($= l,\ l+1$). Since in the Bohr-Sommerfeld theory this
number determined the eccentricity of the elliptical orbits, the relativity
doublets were associated with orbits of different eccentricity. From
the point of view of the present theory, the relativity doublets should
be associated rather with orbits of the same size and eccentricity but
with opposite orientations of the spin. Such relativity or ‘spin’-doublets
are extremely narrow in hydrogen or ionized helium, but they become
very broad in X-ray spectra, their width increasing roughly as the
fourth power of the effective nuclear charge [according to the
approximate formula (264 c)]. They are rather broad, too, in the spectra of
alkali atoms and other complicated systems with one external electron.
In this case, however, they are due not to a large effective nuclear
charge, but to a rapid variation of the latter, owing to the decrease of
the shielding effect of the inner electrons when the outer electron
approaches the nucleus. Sommerfeld’s formula is, of course,
inapplicable to this case, which is characterized by a large $\Delta l$-separation
(‘screening effect’) and a relatively small $\Delta j$-separation (‘spin’ or
relativity effect).

To a given value of $k$ (i.e. of $l$ and $j$) there corresponds a degenerate
set of states specified by different values of the axial quantum number $m$
or of the number $m_j = m+\tfrac{1}{2}$ which determines the z-component of the
total angular momentum. This degeneracy is of exactly the same
type as that discussed before in connexion with Schrödinger’s theory;
it can be pictured as due to the possibility of $2j+1 = 2|k|$ quantized
orientations of the angular momentum vector with regard to the z-axis,
corresponding to all half-integral values of $m+\tfrac{1}{2}$ between $+j$ and $-j$.
We have in fact in the case $k > 0$ a set of function-quadruplets $\psi$ with
the following angular factors: $Y_{k-1,m+1}$, $Y_{k-1,m}$; $Y_{k,m+1}$, $Y_{k,m}$. The
maximum or minimum admissible value of $m$ is that for which one function
at least of each pair is different from zero. We thus get $m \leq k-1$ and
$m \geq -k$, i.e. $-j \leq m+\tfrac{1}{2} \leq j$.
A similar relation with $k$ replaced by $|k|$ is obtained in the case $k < 0$.
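
The counting of the degenerate sub-states can be made explicit with a few lines of code. This is only an illustrative sketch of the bounds $-|k| \leq m \leq |k|-1$ stated above; the function name `axial_states` is hypothetical.

```python
from fractions import Fraction

def axial_states(k):
    """List the pairs (m, m_j) of the degenerate states belonging to a non-zero k."""
    if k == 0:
        raise ValueError("k = 0 is excluded")
    size = abs(k)
    # m runs from -|k| to |k| - 1, so that m_j = m + 1/2 runs from -j to +j.
    return [(m, m + Fraction(1, 2)) for m in range(-size, size)]
```

For $k = 1$ the list contains the two states $m = -1$ and $m = 0$, i.e. $m_j = \mp\tfrac{1}{2}$; in general its length is $2|k| = 2j+1$.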


Thus, for example, in the particular case $k = 1$, $l = 0$ and $j = \tfrac{1}{2}$,
which corresponds to the normal state of the hydrogen atom ($n = 1$;
it should be mentioned that the case $k = -1$, i.e. $l = 1$, corresponds
to an excited state $n \geq 2$) we actually obtain two sub-states specified
by the following expressions for the functions $\psi_1, \dots, \psi_4$:

$\psi_\alpha = RY_\alpha$ ($\alpha = 1, 2, 3, 4$),
with the radial factor
Rr) = r¥O-7")-le-We, 


and the angular factors 


Y,=0 ¥,=—1, Y= cece inde 
Ye coat cea viene OBO 
. iFya=y4) 
in the case $m = 0$, i.e. $m_j = +\tfrac{1}{2}$, and
ae, 2 “ ty 
¥y,=-—1, ¥4,=0, ¥;= ae ay wee cos 6 
a. ae 
Y, iy 
in the case $m = -1$, i.e. $m_j = -\tfrac{1}{2}$. The two states correspond to the
same value of the inner quantum number $j$, namely, $j = \tfrac{1}{2}$. They
are associated with the same spherically symmetrical distribution of
the probability density, which is proportional to the square of the
radial factor $R(r)$. It should be noticed that this factor becomes
infinite at $r = 0$, but in such a way that the integral $\int_0^\infty R^2r^2\,dr$ remains
convergent.

The difference between the two states consists in the fact that for 
the first of them the spin axis of the electron is pointing in the positive 
and for the second in the negative direction of the z-axis, as follows 
from the approximate equation for the characteristic values © 


a, = any 
with ¥, = ¥, = 0. 

We must consider in conclusion the modification of the states, and
in particular of the energy-levels, of a hydrogen-like or an alkali-like atom
in the presence of a homogeneous magnetic field $\mathbf{H}$ (Zeeman effect).
In the former case we have to deal with a twofold ($k$, $-k$) degeneracy,
corresponding to the absence of any screening effect. This degeneracy is
to be taken into account for very weak magnetic fields only, so weak
that the product $\mu H$ is very small compared with the relativistic ($\Delta j$)
separation. In the latter case, on the contrary, the relativistic splitting
is as a rule much smaller than the screening ($\pm k$) separation, so that
for fields of moderate strength the only degeneracy present is that
which corresponds to different values of the axial quantum number $m$.

It can easily be shown that the characteristic functions $\psi$
corresponding to this privileged character of the z-axis in the absence of
a magnetic field are such that the non-diagonal matrix elements of the
magnetic perturbation energy

S = he rxy = teH(ry,—yyz) (299) 

all vanish. So long as the magnetic field is sufficiently weak the addi- 
tional energy due to its action can be determined accordingly as the 
diagonal elements of S with regard to the corresponding unperturbed 
states. 

The additional magnetic energy of a state specified by the quantum 
numbers k, m is thus given by the formula 


$$\Delta\epsilon'_{km} = S_{km;km} = \int\psi^{\dagger}_{km}S\psi_{km}\,dV. \tag{299 a}$$
Dropping for the sake of simplicity the indices k, m, we have, according 
to (299), 
(Sp), = heHilz-+iyhy, — (Sy), = —heHi(x—ty)}g, 
(Syp)g = heHi(zt+ty)fe,  (Sp)y = —feHi(z—ty)py, 


and consequently 


BIS = eh [ative (e—iy Why + (x+y Wa (eta a] 
= ~eGRo (etiyit yi) 


or ptSy = —eGRorsinde#(Yret ya). (299) 


Substituting here the expressions for the functions % derived before 
and integrating, we get 
Mein 
= Ineh [dr F(r)@(ryr? [ i(afay Pram Pimsrt4F4e Prermsa Prm)sin’d a0 
(299 c) 
in the case of the equations (290 a) and a similar expression in the case 
(290 b). 
The radial factor in this expression can easily be calculated with the
help of the differential equations (296) which are satisfied by the
functions $rF = f$ and $rG = g$. Taking the first of these equations and
putting approximately $\epsilon'+\epsilon_0-U = 2\epsilon_0$, we get


whence 


Pp FG? dr = | for arm ft-(k { Par— | tpt), 


ao Lo] 


| a, ff ad/P\,_ _ ff 
or since | a dr = {z(4) dr = | % dr, 
0 0 0 


v 3 ~ ch” : 2 — he : 
[ Per ar +8) [Par = po een 
0 


if the function $f(r)$ is appropriately normalized ($\int f^2\,dr = 1$).
The angular factor in (299c) can also be evaluated without much
trouble with due regard to the normalizing conditions for the functions
$P(\theta)$.
We obtain in this way (neglecting terms of the second order in 1/c) 
, __ &h ee 
Jem = Fam_o9 2" + 4) — pHg(m-+ 4), (300) 
with
$$g = \frac{k}{k-\tfrac{1}{2}}\quad(k > 0),\qquad g = \frac{|k|}{|k|+\tfrac{1}{2}}\quad(k < 0), \tag{300 a}$$
in agreement with the results obtained at the end of § 30 (if $m+\tfrac{1}{2}$ is
identified with $m_j$).
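
The factor $g$ can be compared with the conventional Landé expression for a single electron of spin $\tfrac{1}{2}$. The sketch below is illustrative; it assumes the usual formula $g = 1+[j(j+1)+s(s+1)-l(l+1)]/2j(j+1)$ as a known result rather than as a formula taken from this section, and the function names are hypothetical. It also shows that both cases $k > 0$ and $k < 0$ are covered by the single expression $2k/(2k-1)$.

```python
from fractions import Fraction

def lande_g_from_k(k):
    """g-factor in terms of k; the expression 2k/(2k - 1) covers both signs of k."""
    if k == 0:
        raise ValueError("k = 0 is excluded")
    return Fraction(2 * k, 2 * k - 1)

def lande_g_standard(l, j):
    """Conventional Lande factor for one electron with spin 1/2 (assumed known result)."""
    s = Fraction(1, 2)
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))

def zeeman_factor(k, m):
    """The factor g*(m + 1/2) multiplying mu*H in the weak-field shift."""
    return lande_g_from_k(k) * (m + Fraction(1, 2))
```

For example, $k = 1$ ($l = 0$, $j = \tfrac{1}{2}$) gives $g = 2$, while $k = -1$ ($l = 1$, $j = \tfrac{1}{2}$) gives $g = \tfrac{2}{3}$ and $k = 2$ ($l = 1$, $j = \tfrac{3}{2}$) gives $g = \tfrac{4}{3}$.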

The integration of the expression (299c) requires a great deal of
calculation. This can be avoided, however, if we replace the operator
$\boldsymbol{\mu}$ by the operators
$$\boldsymbol{\mu}_{\text{eff}} = \frac{e}{2m_0c}\gamma_0(\mathbf{L}+2\mathbf{s})\quad\text{or}\quad\boldsymbol{\mu}'_{\text{eff}} = \frac{e}{2m_0c}(\mathbf{L}+2\mathbf{s}),$$
which have been shown in the preceding section to be approximately
equivalent to it and to each other with an accuracy of the second order
in $1/c$. To the same approximation we can replace $\gamma_0$ in the expression
(287 a) by 1, with the result
$$L^2 = \frac{h^2}{4\pi^2}k(k-1) = \frac{h^2}{4\pi^2}l(l+1)\quad\text{when } k > 0.$$
Combining it with the equation $M^2 = \left(\dfrac{h}{2\pi}\right)^2j(j+1)$ and putting
$\mathbf{s} = (g-1)\mathbf{M}$, we obtain, with the help of (267 a) and (289b), the above
approximate expression for $\Delta\epsilon'_{km}$.

The preceding theory is applicable only to a comparatively weak 
magnetic field. When the shift of the energy-levels produced by the 


§ 33 MOTION OF ELECTRON IN CENTRAL FIELD OF FORCE 343 
magnetic field becomes of the same order of magnitude as the Aj-doublet 
separation, the spin perturbation to which this separation is due must 
be taken into account together with the magnetic perturbation. 

We must start in this case with the two unperturbed states of equal 
energy €;,,, Specified by the same values of / and m and belonging to 
the values j == 1+} of the inner quantum number. The combined spin- 
magnetic perturbation S = S,,+5S,, produces a splitting-up of the 
unperturbed energy-level into two levels ¢},-+Acj,;, according to the 
equation S4,—Ae’ Sis 

Ss; Soo— Ac’ 
where the index 1 refers to one of the two degenerate states ($j = l + \tfrac12$,
say), and the index 2 to the other ($j = l - \tfrac12$).

The non-diagonal elements of the spin perturbation $(S_s)_{12}$ and $(S_s)_{21}$
must obviously vanish since the states $j = l \pm \tfrac12$ are stationary in the
absence of the magnetic field. The diagonal elements $(S_s)_{11} = \Delta_1\epsilon'$,
$(S_s)_{22} = \Delta_2\epsilon'$ can be defined therefore as the additional energies due
to the spin perturbation alone, their difference $\delta = \Delta_1\epsilon' - \Delta_2\epsilon'$ being
equal to the Aj-doublet separation in the absence of the magnetic field. 
The action of the latter can thus be determined by the equation 


, , 
Smu—Aye€ —Ae Smiz 


= 0, 


S net Sin2g— Ae €’ — Ae’ aie sade 
where 
Sau = pet as (mt 4), S22 = ——(m-+ $) 
oe agile an pus 
and Bos = Ss wD a+) 


The first two expressions are given by (300); the expressions for $S_{m12}$ and
$S_{m21}$ can be derived in a similar manner [see § 20, equation (155 b)].

It is customary to refer the displaced energy-levels «; and «, to the 
‘centre of gravity’ of the doublet, i.e. to the energy «, determined by 
the formulae ef = e+ (IE)8, 


€, = e,—/B 
[5 = (21+1)B = e;—e,]. Putting A,e’ = (1+1)8, A,«’ = —/B, and 
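$\epsilon' - \epsilon_0 = \Delta\epsilon'$, we obtain from (300) the following equation for $\Delta\epsilon'$:

$$(\Delta\epsilon')^2 + [\beta + \mu H(2m+1)]\,\Delta\epsilon' - l(l+1)\beta^2 + (\mu H)^2 m(m+1) = 0.$$

Its solution runs

$$\Delta\epsilon' = -\tfrac12\beta - \mu H(m+\tfrac12) \pm \sqrt{[\mu H(m+\tfrac12) + \tfrac12\beta]^2 + \beta^2 l(l+1) - \mu^2H^2 m(m+1)}. \tag{301 b}$$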


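If the magnetic field is very weak, we get, in the first approximation,

$$\Delta\epsilon' = -\tfrac12\beta - \mu H(m+\tfrac12) \pm \Bigl[\beta(l+\tfrac12) + \tfrac12\mu H\,\frac{m+\tfrac12}{l+\tfrac12}\Bigr],$$

i.e.

$$\Delta\epsilon' = -\beta(l+1) - \mu H\,\frac{l+1}{l+\tfrac12}\,(m+\tfrac12),$$

and

$$\Delta\epsilon' = +\beta l - \mu H\,\frac{l}{l+\tfrac12}\,(m+\tfrac12),$$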


in agreement with (300). In the opposite case of a very strong magnetic
field (so strong that the doublet distance $\delta$ is small in comparison with
the splitting $\mu H$ due to the field alone, when $\delta = 0$) the formula
(301 b) reduces to

$$\Delta\epsilon' = -\mu H(m+\tfrac12) \pm \tfrac12\mu H = -\mu H(m + \tfrac12 \mp \tfrac12),$$

i.e. to the earlier formula (266 b) which determines the normal Zeeman
effect.
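
The two limiting cases of the solution (301 b) can be checked numerically. The following is only a minimal sketch in arbitrary energy units; the names `zeeman_shifts`, `beta` and `mu_h` (the product $\mu H$) are illustrative assumptions and do not occur in the text.

```python
import math

def zeeman_shifts(l, m, beta, mu_h):
    """Both roots of the quadratic for the level shift, formula (301 b):
    x**2 + [beta + mu_h*(2m+1)]*x - l*(l+1)*beta**2 + mu_h**2*m*(m+1) = 0.
    beta and mu_h (assumed name for mu*H) are given in the same energy unit."""
    half_b = 0.5 * beta + mu_h * (m + 0.5)
    disc = half_b ** 2 + beta ** 2 * l * (l + 1) - mu_h ** 2 * m * (m + 1)
    root = math.sqrt(disc)
    return (-half_b - root, -half_b + root)

# No field: the roots are the pure doublet shifts -(l+1)*beta and l*beta.
no_field = zeeman_shifts(l=1, m=0, beta=1.0, mu_h=0.0)

# No doublet separation: the roots are -mu_h*(m+1) and -mu_h*m.
strong_field = zeeman_shifts(l=1, m=0, beta=0.0, mu_h=1.0)
```

For a small but finite field the upper root approaches $\beta l - \mu H\,l(m+\tfrac12)/(l+\tfrac12)$, which is the weak-field expression quoted above.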


34. Negative Energy States; Positive Electrons and Neutrons 
We have seen above that in Pauli’s theory the two values $\alpha = 1$ and
$\alpha = 2$ of the spin-coordinate refer to the two opposite orientations of
the electron’s spin or magnetic axis parallel to the $z$-axis. One might
be inclined to think that the values $\alpha = 1, 2, 3, 4$ of the Dirac theory
refer to four different orientations of the electron. This is, however,
not true. Taking the probable value of the spin angular momentum in
the $z$ direction we get, according to (275b):

$$\bar{s}_z = \frac{h}{4\pi}\int(-\psi_1^*\psi_1 + \psi_2^*\psi_2 - \psi_3^*\psi_3 + \psi_4^*\psi_4)\,dV,$$

which shows that the values $\alpha = 3$ and $\alpha = 4$ refer to the same orientations
(in the negative and positive direction parallel to $z$) as the values
$\alpha = 1$ and $\alpha = 2$ respectively.

It should be mentioned that we get exactly the opposite result as to
the meaning of $\alpha = 3$ and $\alpha = 4$ if, instead of the angular (mechanical)
momentum, we consider the magnetic moment due to the spin p = py &. 
We get, namely, in this case [cf. (283 a)]: 

P= f Mav = pf (otter tells oty,) AV. 
This shows that in the states $\alpha = 3, 4$ the electron behaves, so far as
its spin magnetic moment is concerned, as a particle with a positive 
charge. 


As has been explained already, the quadruplicity of the Dirac theory
is connected with the introduction of states of negative energy $\epsilon$. The
values $\alpha = 3, 4$ for a state of this type have the same physical meaning
as the values $\alpha = 1, 2$ for the corresponding state of positive energy
(the functions $\psi_3$, $\psi_4$ being large compared with $\psi_1$, $\psi_2$ in the former
case and small in the latter). The quadruplicity appearing in the
comparison of Schrödinger’s and Dirac’s theory can be pictured as the
result of the reflection of a point representing a Schrödinger state in
the plane $\epsilon = 0$ and further as the splitting of the two points into
a Pauli doublet.

To each characteristic value of Schrödinger’s energy constant $H'$
there correspond in Dirac’s theory four energy values $\epsilon'$ which can be
denoted as follows:


mc?+H’'!, m,c?-+ H’+ (> 0), 

m,c?-+ H';, mM, c?4- H’~ (< 0), 
the first pair lying close to each other as well as the second pair, the 
two pairs having approximately opposite values. 


The matrix elements of any physical quantity represented by the 
four-dimensional matrix-operator F’, as defined by the general formula 


Fee = [vi Phew = [X X¥taRaven dV 
a=1 B=) 
can be combined accordingly into four-dimensional matrices: 
Fy’! Pye nt Frye yy Bye: 
—— Fiyet nt Faye ys Fyyes yz 
Prez F, ard 2 hie Fyes wz 
Puen Fnezat Piper: Pern 
If the function $F\psi_{\epsilon''}$ is expanded in a series of functions $\psi_{\epsilon'}$ according
to the formula

$$F\psi_{\epsilon''} = \sum_{\epsilon'} F_{\epsilon'\epsilon''}\,\psi_{\epsilon'},$$

negative energy states must be taken into account as well as the
states of positive energy unless the matrix elements $F_{\epsilon'\epsilon''}$, where $\epsilon' < 0$
and $\epsilon'' > 0$, all vanish. This circumstance is especially important in
various perturbation problems; with $F$ denoting the operator of the
perturbation energy, correct results as to the probability of combined
(double) transitions are obtained only if intermediate states of negative
energy are considered along with those of positive energy. In the
problem of the scattering of light by a free electron, for example, the
relative importance of intermediate states of negative energy is larger
the smaller the (positive) energy of the initial and final state. This
result (due to Tamm) is especially startling because relativity corrections
vanish in the limiting case of small velocities, so that negative energy
states which form a characteristic relativity effect would be expected
to become insignificant in this limiting case.

Another interesting example of the paradoxical role played in Dirac’s
theory by the states of negative energy is presented by the motion of
an electron through a potential energy jump, as discussed by O. Klein.
For the sake of simplicity we shall take the equation of the second
order, $D\psi = 0$ ($D = u^2 - u_t^2 + m_0^2c^2$), to which the four equations of the
Dirac theory reduce for free motion. The continuity conditions for the
four functions $\psi_1, \ldots, \psi_4$ can be replaced in this case by the continuity
condition for one of them and its derivative in the direction of the
energy jump. Assuming the latter to take place in the direction of
the $x$-axis, the potential energy being equal to 0 on the left of the
plane $x = 0$ and $U = \text{const.} > 0$ on the right, and assuming further the
electron to move parallel to the $x$-axis, we get

$$\psi = A' e^{i\frac{2\pi}{h}(g_1x - \epsilon t)} + A'' e^{i\frac{2\pi}{h}(-g_1x - \epsilon t)}$$

for $x < 0$ (incident and reflected wave), and

$$\psi = B' e^{i\frac{2\pi}{h}(g_2x - \epsilon t)}$$

for $x > 0$ (transmitted wave), where

$$g_1^2 = \epsilon^2/c^2 - m_0^2c^2 \quad\text{and}\quad g_2^2 = (\epsilon - U)^2/c^2 - m_0^2c^2.$$

The continuity conditions give the same relations $A' + A'' = B'$ and
$A' - A'' = B'g_2/g_1$ as in the non-relativity theory [cf. Part I]. The
important difference between the latter and the present theory consists
in the fact that the above relativity expression for $g_2$ remains real
not only in the case when $U$ is smaller than the kinetic energy of
the incident electron $\epsilon - m_0c^2$, but also in the case when it is larger
than $m_0c^2 + \epsilon \approx 2m_0c^2$ (if $\epsilon$ is not very different from $m_0c^2$). This
means that total reflection ($g_2$ imaginary) takes place only within the
range

$$\epsilon - m_0c^2 < U < \epsilon + m_0c^2,$$

whereas beyond it we get transmission both for small and for large
values of $U$.
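
The three regimes of the potential jump follow directly from the sign of $g_2^2$. A minimal sketch, assuming units in which $c = 1$ by default; the function names `g2_squared` and `regime` and the regime labels are illustrative and not taken from the text.

```python
def g2_squared(eps, U, m0c2, c=1.0):
    """Square of the momentum g_2 behind the jump:
    (eps - U)**2 / c**2 - m0**2 * c**2, with m0c2 standing for m0*c**2."""
    return ((eps - U) ** 2 - m0c2 ** 2) / c ** 2

def regime(eps, U, m0c2):
    """Classify the jump height U for an incident total energy eps > m0c2."""
    if U < eps - m0c2:
        return "ordinary transmission"      # g_2 real, U below the kinetic energy
    if U <= eps + m0c2:
        return "total reflection"           # g_2 imaginary (or zero at the edges)
    return "paradoxical transmission"       # g_2 real again, U > eps + m0c2

# Slow electron: eps only slightly above the rest energy (here m0c2 = 1).
cases = [(U, regime(1.1, U, 1.0), g2_squared(1.1, U, 1.0)) for U in (0.05, 1.0, 2.2)]
```

With these sample values `g2_squared` is positive for the smallest and the largest jump and negative for the intermediate one, which reproduces the range of total reflection stated above.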

It seems hardly possible to give a reasonable interpretation of this
result. It can be shown, however, that the paradoxical transmission
probability for the case $U > \epsilon + m_0c^2$ rapidly decreases when the
discontinuity $U$ in the potential energy at $x = 0$ is replaced by a gradual
increase within an interval comparable with or larger than the
wave-length of the electron $\lambda = h/g$.

The physical meaning of the states of negative energy is at present 
not quite certain. They were initially interpreted by Dirac in con- 
nexion with the duplicity of electricity, and served to reduce protons to a 
mere absence of electrons if space is assumed to be nearly saturated with 
electrons in states of negative energy, with due regard to Pauli’s ex- 
clusion principle. It is, however, impossible to interpret in this way the 
difference in the mass of electrons and protons. According to Pauli and
to Weyl the rest-mass of a proton considered as a hole in the distribution
of electrons with negative energies should be exactly equal to the
rest-mass $m_0$ of an electron.

Although Dirac’s original theory has thus failed to reduce protons to 
electrons, yet it may perhaps be credited with predicting the existence
and properties of things that have hitherto never been anticipated by the 
experimental physicist and that seem to reveal themselves in the Wilson 
chamber cloud-tracks of particles released by the penetrating rays of 
cosmic origin and by very hard gamma rays. These are the ‘positive 
electrons’ whose discovery has recently been announced by Anderson 
(1932) and also by Blackett (1933). 

The experimental data are still too scarce to make it sure that positive 
electrons really exist. But if they do exist they fit beautifully in the 
scheme of Dirac’s theory. The fact that they are not found under 
ordinary conditions is explained by the extremely large probability that 
a ‘positive electron’ will recombine with a negative one (the latter 
falling from a state of positive energy into the hole constituting the 
former), this recombination being accompanied by the emission of two 
photons (cf. Part I, § 19). 

The visible existence of the material world around us must be 
guaranteed from this point of view by the fact that the total number of 
electrons is larger than the number of available states of negative energy, 
at least in that part of the world which is accessible to observation. 

Assuming the existence of positive electrons, it would be natural to 
postulate the existence of ‘negative protons’ formed by holes in a 
practically saturated distribution of protons between states of negative 
energy. 

It is difficult, however, to accept the idea that space is filled up with 
one or two sorts of particles forming a kind of infinitely dense ‘ether’ 
which is revealed in a negative way only through the occasional absence 
of the full quota of these particles. 



Dirac’s equation has served as a starting-point for the introduction— 
besides positive electrons—of particles devoid of electrical charge and 
denoted accordingly as ‘neutrons’. Dirac himself attempted in 1931 to 
introduce neutrons as magnetic analogues of electrons, i.e. as particles
possessing a magnetic charge instead of an electric one. Pauli on the 
other hand proposed (simultaneously with Dirac) a theory of neutrons 
devoid of charge (both electric and magnetic) but possessing a magnetic 
moment and a spin angular momentum associated with it. The necessity, 
or rather plausibility, of introducing neutrons in addition to protons 
and electrons as constituent parts of atomic nuclei was dictated by 
certain nuclear phenomena, like the apparent failure of the alterna- 
tion principle (Bose-Einstein statistics holding for nuclei supposed to 
consist of an odd number of particles) and of the principle of
conservation of energy (continuous $\beta$-ray spectra of radioactive substances).
These difficulties could be removed by admitting the existence in the 
nuclei of a third sort of elementary particles in a bound state. The idea 
of treating these particles as ‘magnetic neutrons’ was suggested by the 
possibility of replacing Dirac’s equation for the electron by a similar 
equation with $e = 0$ and with the mass $m_0$ increased by an additional
term

$$L = \mu(H\cdot\xi - E\cdot\eta)$$

which represents the action of the magnetic and electric field on the
neutron’s magnetic and electric moment ($\xi$ and $\eta$ being the matrices
(275b) and (275c), and $\mu$ hypothetically Bohr’s magneton). Pauli’s
equation for the neutron can thus be written in the usual form
(e+ aa = 0 with 
€ = Cy'‘P+yo(mc?+ L), 


where p = =v ; the electromagnetic potentials A and ¢ do not appear 


in € since the electric charge with which they must be multiplied is sup- 
posed equal to zero. 

We shall not stop here to develop Pauli’s theory. The remarkable 
fact we are mainly concerned with is that the neutron was discovered ex- 
perimentally by Chadwick, following observations by Curie and Joliot, 
within a year after its existence had been tentatively admitted on theo- 
retical grounds. It made its appearance as the disintegration product of 
certain nuclei bombarded by protons or $\alpha$-particles in the form of a
particle with a mass very little different from that of a proton (while
Pauli expected it to have a mass of the same order of magnitude as the
electron). It is still a matter open to question whether a neutron is
a simple particle like an electron and a proton, or a combination of
both.† The latter alternative seems the more natural, although we are
not yet in a state to substantiate it theoretically, for the present wave- 
mechanical theory is inadequate in treating such systems, whose linear 
dimensions are of the same order of magnitude as the ‘size’ of the 
electron (attributed to it on the electromagnetic theory of mass). As 
to the forces binding the electron and proton in a neutron more 
tightly than in a hydrogen atom—they may be due to the mutual 
attraction of the spin magnetic moments. In fact this attraction (which 
corresponds to a suitable orientation of the spins) increases with de- 
crease of distance much more rapidly than the attraction due to the 
electric charges of the two particles, so that the Coulomb attraction 
becomes negligibly small (relatively) at distances of the order of 
10-44 cm. It cannot be asserted, however, that the usual inverse fourth- 
power law for the mutual attraction of two elementary magnets is 
applicable for distances comparable with the electron’s own dimensions. 


35. The Invariance of the Dirac Equation with regard to Co- 
ordinate Transformations 
We have hitherto considered the Dirac equation of motion for a particular
frame of reference specified by the coordinates $x$, $y$, $z$ and the time $t$.
We shall now investigate the transformation properties of this equation
for such transformations as correspond to a rotation of the coordinate
system $x$, $y$, $z$ in space, or more generally to a Lorentz transformation of
the coordinates and the time (i.e. to a rotation of the original frame
in a four-dimensional space-time manifold).
We shall first write down the Dirac equation in the form of two
two-dimensional matrix equations

$$\sigma\cdot u\,\varphi + (u_t - m_0c)\chi = 0, \qquad \sigma\cdot u\,\chi + (u_t + m_0c)\varphi = 0 \tag{302}$$

[cf. (257a), § 30] and limit ourselves to rotations in ordinary space,
which do not affect the operator $u_t$. The invariance of equations (302)
with regard to such rotations can be achieved in two different ways:


(1) By considering the wave functions (matrices) $\varphi = \begin{pmatrix}\varphi_1\\ \varphi_2\end{pmatrix}$ and
$\chi = \begin{pmatrix}\chi_1\\ \chi_2\end{pmatrix}$ as invariant and the matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ as covariant, i.e.
transforming according to the same law as the coordinates $x$, $y$, $z$. Under
this condition the product $\sigma\cdot u = \sigma_xu_x + \sigma_yu_y + \sigma_zu_z$ will define a scalar
(invariant) operator.

† It might also be surmised that the proton is a complicated particle formed by the
combination of a neutron with a positive electron.

(2) By considering the matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ as invariant numerical
operators, and introducing a suitable transformation for the matrices
$\varphi$, $\chi$.

The two methods must, of course, give equivalent results. In the
first case we can define the matrix $\sigma_n$ for any direction $n$ (which may
be that of one of the new coordinate axes) as the projection of the
vector $\sigma$ in this direction. Using the polar angles $\theta_n$, $\phi_n$ to specify it
with respect to the original coordinate system $C(x, y, z)$, we have

$$\sigma_n = \sigma_x\cos(x,n) + \sigma_y\cos(y,n) + \sigma_z\cos(z,n) = \sin\theta_n(\sigma_x\cos\phi_n + \sigma_y\sin\phi_n) + \sigma_z\cos\theta_n,$$

which is equivalent to four equations for the matrix elements $\sigma_{n\alpha\beta}$
($\alpha, \beta = 1, 2$) of $\sigma_n$. With the help of the expressions

$$\sigma_x = \begin{pmatrix}0&1\\1&0\end{pmatrix},\quad \sigma_y = \begin{pmatrix}0&i\\-i&0\end{pmatrix},\quad \sigma_z = \begin{pmatrix}-1&0\\0&1\end{pmatrix}$$

defining the rectangular components of $\sigma$ in the system $C$, we get

$$\sigma_n = \begin{pmatrix}-\cos\theta_n & \sin\theta_n\,e^{i\phi_n}\\ \sin\theta_n\,e^{-i\phi_n} & \cos\theta_n\end{pmatrix}. \tag{302 a}$$

This equation can be applied for the definition of the matrices $\sigma_{x'}$, $\sigma_{y'}$,
$\sigma_{z'}$, which represent the rectangular components of the vector $\sigma$ with
regard to a new coordinate system $C'(x', y', z')$.
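
The projection formula and its closed form (302 a) can be compared numerically. The sketch below is illustrative only: it assumes the representation $\sigma_x = \begin{pmatrix}0&1\\1&0\end{pmatrix}$, $\sigma_y = \begin{pmatrix}0&i\\-i&0\end{pmatrix}$, $\sigma_z = \begin{pmatrix}-1&0\\0&1\end{pmatrix}$, and all function names are hypothetical.

```python
import cmath
import math

# Spin matrices in the assumed representation (note sigma_z = diag(-1, 1)).
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, 1j], [-1j, 0]]
SIGMA_Z = [[-1, 0], [0, 1]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sigma_projected(theta, phi):
    """sigma_x cos(x,n) + sigma_y cos(y,n) + sigma_z cos(z,n) for polar angles (theta, phi)."""
    nx = math.sin(theta) * math.cos(phi)
    ny = math.sin(theta) * math.sin(phi)
    nz = math.cos(theta)
    return [[nx * SIGMA_X[i][j] + ny * SIGMA_Y[i][j] + nz * SIGMA_Z[i][j]
             for j in range(2)] for i in range(2)]

def sigma_closed_form(theta, phi):
    """Closed form corresponding to (302 a)."""
    return [[-math.cos(theta), math.sin(theta) * cmath.exp(1j * phi)],
            [math.sin(theta) * cmath.exp(-1j * phi), math.cos(theta)]]

def max_difference(a, b):
    """Largest absolute difference between corresponding elements."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

Besides agreeing with the closed form, the projected matrix squares to the unit matrix for every direction, as a component of $\sigma$ must.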

We shall not, however, write down the explicit expressions for these
matrices (which can easily be found with the help of the three Eulerian
angles), but shall limit ourselves to presenting the general
transformation equation in the form

$$\sigma'_{n\alpha\beta} = \sum_{m=1}^{3} a_{mn}\,\sigma_{m\alpha\beta}, \tag{302 b}$$

where the indices $(m, n) = 1, 2, 3$ stand for the three axes of the old
and the new system respectively ($\sigma_1 = \sigma_x$, etc.), while $a_{mn}$ is the matrix
of the orthogonal transformation $C \to C'$:

$$x'_n = \sum_m a_{mn}\,x_m.$$

It should be emphasized that the indices $m$, $n$ which specify the
coordinate axes or the rectangular components of $\sigma$, have nothing to do
with the indices $\alpha$, $\beta$ which specify the matrix elements of $\sigma$ or of its
rectangular components.
The transition from the first method (of transforming $\sigma_n$) to the
second method (of transforming $\varphi_\alpha$ and $\chi_\alpha$) can be carried out in the
following way:

We try to find a unitary two-dimensional matrix $A$ such that the
transformation defined by (302b) shall be equivalent to the following:

$$\sigma'_n = A^{-1}\sigma_nA \quad (A^{-1} = A^\dagger), \tag{303}$$

that is,

$$\sigma'_{n\alpha\beta} = \sum_\gamma\sum_\delta A^*_{\gamma\alpha}A_{\delta\beta}\,\sigma_{n\gamma\delta} \quad (n = 1, 2, 3),$$

involving a component of $\sigma$ along a given new axis and along the
corresponding axis only of the original coordinate system.

The relation between the transformation (302b) and (303) can be
stated as follows: in the former the matrices $\sigma_n$ (or $\sigma'_n$) appear as
components of a vector in ordinary three-dimensional space, whereas in the
second case they appear as tensors in the two-dimensional spin-space
specified by the Greek indices $\alpha$, $\beta$, etc. The transformation matrices
$a_{mn}$ and $A_{\alpha\beta}$ are both unitary and refer respectively to the ordinary
space and to the state-space.

Let us suppose that we have succeeded in finding $A$ and let us write
the scalar product $\sigma\cdot u$ in the form

$$\sum_n\sigma'_nu'_n = \sum_n A^{-1}\sigma_nA\,u'_n = A^{-1}\Bigl(\sum_n\sigma_nu'_n\Bigr)A.$$

($A$ commutes with $u_n$, since the latter is a scalar in the state-space.)
The transformed equations (302) can be written accordingly in the form

$$A^{-1}\Bigl(\sum_n\sigma_nu'_n\Bigr)A\varphi + (u_t - m_0c)\chi = 0,$$
$$A^{-1}\Bigl(\sum_n\sigma_nu'_n\Bigr)A\chi + (u_t + m_0c)\varphi = 0.$$

Multiplying them on the left by the matrix $A$, we get

$$\Bigl(\sum_n\sigma_nu'_n\Bigr)\varphi' + (u_t - m_0c)\chi' = 0, \qquad \Bigl(\sum_n\sigma_nu'_n\Bigr)\chi' + (u_t + m_0c)\varphi' = 0 \tag{303 a}$$

with the operator-matrix $\sigma\cdot u'$ of the same form as in the original
coordinate system and with the transformed wave functions

$$\varphi' = A\varphi, \quad \chi' = A\chi. \tag{303 b}$$

We shall determine the transformation matrix $A$ for the simple case
of a rotation in the $(x, y)$-plane through a given angle $\phi$ (in the
direction from $x$ to $y$). This gives

$$x' = x\cos\phi + y\sin\phi, \quad y' = -x\sin\phi + y\cos\phi, \quad z' = z,$$

and consequently

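$$\sigma'_x = \sigma_x\cos\phi + \sigma_y\sin\phi, \quad \sigma'_y = -\sigma_x\sin\phi + \sigma_y\cos\phi, \quad \sigma'_z = \sigma_z, \tag{304}$$

that is,

$$\sigma'_x = \begin{pmatrix}0 & e^{i\phi}\\ e^{-i\phi} & 0\end{pmatrix},\quad \sigma'_y = \begin{pmatrix}0 & ie^{i\phi}\\ -ie^{-i\phi} & 0\end{pmatrix},\quad \sigma'_z = \begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix}.$$

Now we must have, irrespective of the index $n$,

$$\sigma_nA = A\sigma'_n,$$

and in particular for $n = 3$, $\sigma_zA = A\sigma_z$, that is, since $\sigma_z$ is a diagonal
matrix,

$$(\sigma_{z\alpha\alpha} - \sigma_{z\beta\beta})A_{\alpha\beta} = 0,$$

whence it follows that $A$ must also be a diagonal matrix. Putting
$A = \begin{pmatrix}A_1 & 0\\ 0 & A_2\end{pmatrix}$ we get further from $\sigma_xA = A\sigma'_x$

$$\begin{pmatrix}0 & A_2\\ A_1 & 0\end{pmatrix} = \begin{pmatrix}0 & A_1e^{i\phi}\\ A_2e^{-i\phi} & 0\end{pmatrix},$$

that is, $A_2 = A_1e^{i\phi}$, $A_1 = A_2e^{-i\phi}$,
or consequently $A_1 = ce^{-\frac12 i\phi}$, $A_2 = ce^{+\frac12 i\phi}$. The same result is obtained
from the equation $\sigma_yA = A\sigma'_y$. The constant $c$ is determined by the
condition that the determinant of $A$ (a unitary matrix) is equal to 1.
We thus get $c = 1$ and finally

$$A = \begin{pmatrix}e^{-\frac12 i\phi} & 0\\ 0 & e^{+\frac12 i\phi}\end{pmatrix} = \cos\tfrac12\phi + i\sigma_z\sin\tfrac12\phi, \tag{304 a}$$

(the first term being understood to be multiplied by the unit matrix $\delta$)
which corresponds to the following transformed expressions for the
functions $\varphi$, $\chi$:

$$\varphi'_1 = \varphi_1e^{-\frac12 i\phi},\quad \varphi'_2 = \varphi_2e^{+\frac12 i\phi};\qquad \chi'_1 = \chi_1e^{-\frac12 i\phi},\quad \chi'_2 = \chi_2e^{+\frac12 i\phi}. \tag{304 b}$$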
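For a rotation in the plane $(z, x)$ through the angle $\theta$ (in the direction
from $z$ to $x$), i.e. for the transformation

$$\sigma'_x = \sigma_x\cos\theta - \sigma_z\sin\theta, \quad \sigma'_y = \sigma_y, \quad \sigma'_z = \sigma_x\sin\theta + \sigma_z\cos\theta, \tag{305}$$

$$\sigma'_x = \begin{pmatrix}\sin\theta & \cos\theta\\ \cos\theta & -\sin\theta\end{pmatrix},\quad \sigma'_z = \begin{pmatrix}-\cos\theta & \sin\theta\\ \sin\theta & \cos\theta\end{pmatrix},$$

we get in a similar way

$$A_{11} = A_{22}, \quad A_{12} = -A_{21}$$

(from the equation $\sigma_yA = A\sigma'_y$), and further, from $\sigma_xA = A\sigma'_x$ or
$\sigma_zA = A\sigma'_z$, together with the condition $|A| = 1$:

$$A = \begin{pmatrix}\cos\tfrac12\theta & -\sin\tfrac12\theta\\ \sin\tfrac12\theta & \cos\tfrac12\theta\end{pmatrix}, \tag{305 a}$$

whence

$$\varphi'_1 = \varphi_1\cos\tfrac12\theta - \varphi_2\sin\tfrac12\theta, \quad \varphi'_2 = \varphi_1\sin\tfrac12\theta + \varphi_2\cos\tfrac12\theta,$$
$$\chi'_1 = \chi_1\cos\tfrac12\theta - \chi_2\sin\tfrac12\theta, \quad \chi'_2 = \chi_1\sin\tfrac12\theta + \chi_2\cos\tfrac12\theta.$$

It should be mentioned that the transformation matrices (304 a) and
(305a) can be written in the form $e^{\frac12 i\phi\sigma_z}$ and $e^{\frac12 i\theta\sigma_y}$ respectively. We
have in fact, by the definition of the exponential function

$$e^{i\mu\sigma_n} = 1 - \frac{\mu^2}{2!} + \frac{\mu^4}{4!} - \ldots + i\sigma_n\Bigl(\mu - \frac{\mu^3}{3!} + \ldots\Bigr) = \cos\mu + i\sigma_n\sin\mu,$$

since $\sigma_n^2 = \sigma_n^4 = \ldots = \delta\ (= 1)$, $\sigma_n^3 = \sigma_n^5 = \ldots = \sigma_n$.
With $\mu = \tfrac12\phi$ and $\sigma_n = \sigma_z$ this gives (304a); with $\mu = \tfrac12\theta$ and $\sigma_n = \sigma_y$, it
gives (305 a).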
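
The exponential form and the ‘tensor’ law $\sigma'_n = A^{-1}\sigma_nA$ can both be verified numerically. The following is a rough sketch under the assumption that $\sigma_y = \begin{pmatrix}0&i\\-i&0\end{pmatrix}$ and $\sigma_z = \begin{pmatrix}-1&0\\0&1\end{pmatrix}$; the helper names are hypothetical.

```python
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, 1j], [-1j, 0]]
SIGMA_Z = [[-1, 0], [0, 1]]
UNIT = [[1, 0], [0, 1]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Adjoint (conjugate transpose) of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def max_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def exp_series(sigma, mu, terms=30):
    """Partial sum of the power series of exp(i*mu*sigma)."""
    total, term = UNIT, UNIT
    for k in range(1, terms):
        term = matmul(term, combine(1j * mu / k, sigma, 0, UNIT))
        total = combine(1, total, 1, term)
    return total

def rotation_matrix(sigma, angle):
    """Closed form cos(angle/2) + i*sigma*sin(angle/2)."""
    return combine(math.cos(angle / 2), UNIT, 1j * math.sin(angle / 2), sigma)

def transformed(sigma, a):
    """A^dagger * sigma * A, i.e. the component of sigma referred to the new axes."""
    return matmul(dagger(a), matmul(sigma, a))
```

For instance, `transformed(SIGMA_X, rotation_matrix(SIGMA_Z, phi))` should agree with $\sigma_x\cos\phi + \sigma_y\sin\phi$ of (304), and the same construction with `SIGMA_Y` and the angle $\theta$ reproduces (305).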

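Two successive rotations are obviously equivalent to a single one,
specified by a matrix ($a''$ or $A''$) which is equal to the product of the
matrices ($a$, $a'$ or $A$, $A'$) specifying the two component rotations. Thus,
for example, by combining the two preceding rotations in the order
stated, we get a rotation with the transformation matrix (in the
state-space):

$$A'' = \begin{pmatrix}\cos\tfrac12\theta & -\sin\tfrac12\theta\\ \sin\tfrac12\theta & \cos\tfrac12\theta\end{pmatrix}\begin{pmatrix}e^{-\frac12 i\phi} & 0\\ 0 & e^{\frac12 i\phi}\end{pmatrix} = \begin{pmatrix}\cos\tfrac12\theta\,e^{-\frac12 i\phi} & -\sin\tfrac12\theta\,e^{\frac12 i\phi}\\ \sin\tfrac12\theta\,e^{-\frac12 i\phi} & \cos\tfrac12\theta\,e^{\frac12 i\phi}\end{pmatrix},$$

which can be written symbolically in the form

$$A'' = e^{\frac12 i\theta\sigma_y}\,e^{\frac12 i\phi\sigma_z} = e^{\frac12 i(\theta\sigma_y + \phi\sigma_z)}$$

with the understanding that the order of the two factors should not
be inverted.
This means that to a coordinate transformation defined by the equations

$$x'' = (x\cos\phi + y\sin\phi)\cos\theta - z\sin\theta,$$
$$y'' = -x\sin\phi + y\cos\phi,$$
$$z'' = (x\cos\phi + y\sin\phi)\sin\theta + z\cos\theta$$

there corresponds the following transformation of the functions $\varphi$:

$$\varphi''_1 = \varphi_1\cos\tfrac12\theta\,e^{-\frac12 i\phi} - \varphi_2\sin\tfrac12\theta\,e^{\frac12 i\phi}, \quad \varphi''_2 = \varphi_1\sin\tfrac12\theta\,e^{-\frac12 i\phi} + \varphi_2\cos\tfrac12\theta\,e^{\frac12 i\phi},$$

and a similar transformation of $\chi_1$, $\chi_2$.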
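
That the order of the two factors matters can be seen by multiplying the matrices in both orders. A short sketch, with hypothetical helper names, assuming the matrices (304 a) and (305 a) in the form $\mathrm{diag}(e^{-\frac12 i\phi}, e^{\frac12 i\phi})$ and $\begin{pmatrix}\cos\tfrac12\theta & -\sin\tfrac12\theta\\ \sin\tfrac12\theta & \cos\tfrac12\theta\end{pmatrix}$:

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def a_phi(phi):
    """Rotation in the (x, y)-plane, matrix (304 a)."""
    return [[cmath.exp(-0.5j * phi), 0], [0, cmath.exp(0.5j * phi)]]

def a_theta(theta):
    """Rotation in the (z, x)-plane, matrix (305 a)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def combined(phi, theta):
    """First the phi-rotation, then the theta-rotation: the later factor stands on the left."""
    return matmul(a_theta(theta), a_phi(phi))

def max_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# The product taken in the opposite order is a different matrix.
order_matters = max_difference(combined(0.8, 0.5),
                               matmul(a_phi(0.8), a_theta(0.5)))
```

Here `order_matters` is clearly non-zero for generic angles, in line with the remark that the two factors must not be interchanged.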

The preceding results are easily generalized for any number of
successive rotations about arbitrarily chosen axes. These rotations are
always equivalent to a single rotation over an angle $\omega$ about an axis
specified by a unit vector $n$. The transformation matrix $A$ corresponding
to such a rotation is easily seen to be

$$A = \cos\tfrac12\omega + i\sigma_n\sin\tfrac12\omega = e^{\frac12 i\omega\sigma_n}, \tag{306}$$

where $\sigma_n = \sigma\cdot n = n_x\sigma_x + n_y\sigma_y + n_z\sigma_z$ is the component of $\sigma$ along the
axis of rotation. The reciprocal matrix

$$A^{-1} = \cos\tfrac12\omega - i\sigma_n\sin\tfrac12\omega$$

corresponds to a rotation about the same axis in the opposite direction
(or to a rotation about the oppositely directed axis $-n$ through the
same angle); it obviously coincides with $A^\dagger$ since $\sigma^\dagger = \sigma$. Hence it
follows that $A$ is a unitary matrix, as was assumed at the beginning.

A two-dimensional unitary matrix can be represented with the help
of two complex numbers $\alpha$, $\beta$ satisfying the condition $\alpha\alpha^* + \beta\beta^* = 1$ in
the form

$$A = \begin{pmatrix}\alpha^* & \beta\\ -\beta^* & \alpha\end{pmatrix}.$$

In the present case these numbers are

$$\alpha = \cos\tfrac12\omega + in_z\sin\tfrac12\omega, \quad \beta = i(n_x + in_y)\sin\tfrac12\omega.$$

It should be mentioned that the number of real independent parameters
which determine the rotation is equal to three (the rotation angle $\omega$ and
the two angles $\theta$, $\phi$ which determine the direction of the axis of rotation
$n$, or three of the four real numbers which define $\alpha$ and $\beta$ under the
condition $\alpha\alpha^* + \beta\beta^* = 1$).
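
The parametrization by $\alpha$ and $\beta$ can be compared with the closed form $\cos\tfrac12\omega + i\sigma_n\sin\tfrac12\omega$. This is a minimal sketch that assumes the representation with $\sigma_y = \begin{pmatrix}0&i\\-i&0\end{pmatrix}$, $\sigma_z = \begin{pmatrix}-1&0\\0&1\end{pmatrix}$ and the arrangement $\begin{pmatrix}\alpha^*&\beta\\-\beta^*&\alpha\end{pmatrix}$; the function names are illustrative.

```python
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, 1j], [-1j, 0]]
SIGMA_Z = [[-1, 0], [0, 1]]

def alpha_beta(omega, n):
    """The two complex parameters for a rotation through omega about the unit vector n."""
    nx, ny, nz = n
    s = math.sin(omega / 2)
    alpha = complex(math.cos(omega / 2), nz * s)
    beta = 1j * complex(nx, ny) * s
    return alpha, beta

def matrix_from(alpha, beta):
    """Unitary matrix built from alpha and beta (assumed arrangement)."""
    return [[alpha.conjugate(), beta], [-beta.conjugate(), alpha]]

def rotation_matrix(omega, n):
    """cos(omega/2) + i*(n . sigma)*sin(omega/2), assembled element by element."""
    nx, ny, nz = n
    c, s = math.cos(omega / 2), math.sin(omega / 2)
    return [[c * (1 if i == j else 0)
             + 1j * s * (nx * SIGMA_X[i][j] + ny * SIGMA_Y[i][j] + nz * SIGMA_Z[i][j])
             for j in range(2)] for i in range(2)]

def max_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

Only three real numbers are free: the unit vector contributes two and the angle one, while the normalization of $\alpha$ and $\beta$ follows automatically.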

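As has been shown in § 30, the probability density and the rectangular
components of the probability current density are expressed, with the
help of the two-dimensional matrices $\varphi$, $\chi$, $\varphi^\dagger$, $\chi^\dagger$, by the equations

$$\rho = \varphi^\dagger\varphi + \chi^\dagger\chi, \quad j_n = c(\varphi^\dagger\sigma_n\chi + \chi^\dagger\sigma_n\varphi),$$

[$n = 1, 2, 3$; cf. eqs. (259) and (259a)]. Transforming the functions $\varphi$ and
$\chi$ according to the equations $\varphi' = A\varphi$, $\varphi'^\dagger = \varphi^\dagger A^\dagger$ (and likewise for $\chi$), and
regarding the matrices $\sigma_n$ as invariant, we obtain for the same quantities
referring to the rotated system the expressions

$$\rho' = \varphi^\dagger A^\dagger A\varphi + \chi^\dagger A^\dagger A\chi = \varphi^\dagger\varphi + \chi^\dagger\chi = \rho$$

(since $A^\dagger = A^{-1}$), and

$$j'_n = c[\varphi^\dagger(A^\dagger\sigma_nA)\chi + \chi^\dagger(A^\dagger\sigma_nA)\varphi] = c(\varphi^\dagger\sigma'_n\chi + \chi^\dagger\sigma'_n\varphi) = \sum_m a_{mn}\,j_m,$$

in agreement with the invariant character of $\rho$ and the covariant
character of the components of the vector $j$.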
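
The invariance of the density and the vector character of the current can be illustrated for a rotation about the $z$-axis. The sketch below uses arbitrary sample two-component functions and hypothetical helper names; it assumes $\sigma_y = \begin{pmatrix}0&i\\-i&0\end{pmatrix}$, $\sigma_z = \begin{pmatrix}-1&0\\0&1\end{pmatrix}$ and $c = 1$.

```python
import cmath
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, 1j], [-1j, 0]]
SIGMA_Z = [[-1, 0], [0, 1]]
UNIT = [[1, 0], [0, 1]]

def apply(m, v):
    """Matrix m acting on a two-component column v."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def bracket(u, m, v):
    """The number u^dagger m v."""
    w = apply(m, v)
    return u[0].conjugate() * w[0] + u[1].conjugate() * w[1]

def density(phi, chi):
    """phi^dagger phi + chi^dagger chi."""
    return (bracket(phi, UNIT, phi) + bracket(chi, UNIT, chi)).real

def current(phi, chi, sigma, c=1.0):
    """c (phi^dagger sigma chi + chi^dagger sigma phi); real because sigma is Hermitian."""
    return c * (bracket(phi, sigma, chi) + bracket(chi, sigma, phi)).real

def rotate_about_z(v, angle):
    """Apply A = diag(exp(-i*angle/2), exp(+i*angle/2)) to a two-component column."""
    return [cmath.exp(-0.5j * angle) * v[0], cmath.exp(0.5j * angle) * v[1]]

# Arbitrary sample functions (illustrative numbers only).
phi = [0.3 + 0.1j, -0.5 + 0.7j]
chi = [0.2 - 0.4j, 0.6 + 0.2j]
angle = 0.9
phi_new, chi_new = rotate_about_z(phi, angle), rotate_about_z(chi, angle)
```

With the $\sigma_n$ kept fixed, the current computed from `phi_new` and `chi_new` should be the old current referred to axes turned through `angle` from $x$ to $y$, while the density is unchanged.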

The preceding results are easily extended to the four-dimensional 
matrix form of the Dirac equation and of the associated operators. 
Taking, for example, the energy operator 


= 
= U+egyote 2 Yn Uns 
n= 


we can consider it as an invariant with regard to rotations in ordinary
space if the three four-dimensional matrices $\gamma_1 = \gamma_x$, $\gamma_2 = \gamma_y$, $\gamma_3 = \gamma_z$
are defined as covariant operators, satisfying the same transformation
equations as the coordinates $x_1 = x$, $x_2 = y$, $x_3 = z$ or the components
of the operator $u$. The shape of the transformed matrices $\gamma'_n$ is easily
obtained from the above expressions for the transformed matrices $\sigma'_n$
with the help of the invariant relations

$$\gamma_n = \begin{pmatrix}0 & \sigma_n\\ \sigma_n & 0\end{pmatrix}.$$

The same relations can serve for the determination of the unitary
matrices, $L$ say, which determine the equivalent transformation in the
four-dimensional spin-space according to the ‘tensor’ law

$$\gamma'_n = L^{-1}\gamma_nL = L^\dagger\gamma_nL \quad (n = 1, 2, 3).$$

We have, namely,

$$L = \begin{pmatrix}A & 0\\ 0 & A\end{pmatrix}, \tag{306 a}$$

where $A = \begin{pmatrix}A_{11} & A_{12}\\ A_{21} & A_{22}\end{pmatrix}$ is the two-dimensional unitary matrix defining
the transformation of $\sigma_n$ $\Bigl(0 = \begin{pmatrix}0&0\\0&0\end{pmatrix}\Bigr)$.

With the help of the matrix $\xi = \begin{pmatrix}\sigma & 0\\ 0 & \sigma\end{pmatrix}$ which serves to describe
the electron’s spin or magnetic moment [cf. (277)] we can write the
matrix $L$ corresponding to a given rotation $(\omega, n)$ explicitly in the form

$$L = e^{\frac12 i\omega\xi_n} = \cos\tfrac12\omega + i\xi_n\sin\tfrac12\omega, \tag{306 b}$$

similar to (306) with $\sigma$ replaced by $\xi$.
The matrix y) remains invariant under this transformation. Writing 
Dirac’s equation in the form (e+ ,)~ = 0 and using equation (306) for 
the y,,, we can write it for the rotated coordinate system in the form 


3 
[et U+eorye) +e X Lyn Lun] = 0, 
or since (p,+U +979) = L(pt+U +eoy)L, 
3 
Ll p+ U+egyote 2 Yn u;,| Lip = 0. 


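If this equation is multiplied on the left by $L$, it reduces to the original
form, with the old matrices $\gamma_n$, the new components of $u$, and a new
wave function $\psi'$ derived from the old one by means of the
transformation $\psi' = L\psi$.

Putting $\psi = \begin{pmatrix}\varphi\\ \chi\end{pmatrix}$, where $\varphi = \begin{pmatrix}\varphi_1\\ \varphi_2\end{pmatrix}$ and $\chi = \begin{pmatrix}\chi_1\\ \chi_2\end{pmatrix}$, we get, with the help
of (306 a),

$$\begin{pmatrix}\varphi'\\ \chi'\end{pmatrix} = \begin{pmatrix}A & 0\\ 0 & A\end{pmatrix}\begin{pmatrix}\varphi\\ \chi\end{pmatrix}$$

in agreement with the results obtained before.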

It can further be shown directly that under the transformation $\psi' = L\psi$
and $\psi'^\dagger = \psi^\dagger L^\dagger$ the product $\psi^\dagger\psi$ remains invariant while the quantities
$c\psi^\dagger\gamma_n\psi$ transform as the rectangular components of a vector.

We can now turn to the generalization of the preceding results for 
rotations in the four-dimensional space-time manifold of the relativity 
theory, i.e. for Lorentz transformations, corresponding to a transition 
from a state of ‘rest’ to that of uniform motion. 

It will be convenient in this connexion to use Dirac’s equation in
the form $B\psi = 0$, i.e.

$$(\beta_xu_x + \beta_yu_y + \beta_zu_z + \beta_tu_t + m_0c)\psi = 0,$$

or

$$\Bigl(\sum_{n=1}^{4}\beta_nu_n + m_0c\Bigr)\psi = 0, \tag{307}$$

where $n = 1, 2, 3$ stands for $x$, $y$, $z$ respectively, while

$$x_4 = \sqrt{-1}\,ct, \quad u_4 = -\sqrt{-1}\,u_t, \quad \beta_4 = \sqrt{-1}\,\beta_t.$$

It must be emphasized that the imaginary unit $\sqrt{-1}$ is introduced here
simply for the sake of formal symmetry, and that it will be treated in
the sequel as an ordinary ‘real’ number, in the sense that its sign will
not be altered in a transition to conjugate complex quantities. In order
to distinguish this relativistic $\sqrt{-1}$ from that of the quantum theory,
which plays an essential role, we shall denote the relativistic $\sqrt{-1}$ by
the Greek letter $\iota$ ($\iota^2 = -1$, $\iota^* = +\iota$).

A Lorentz transformation is defined as a linear transformation of
the form

$$x'_n = \sum_{m=1}^{4} a_{mn}\,x_m,$$

satisfying the orthogonality condition $\sum_{n=1}^{4} x_n'^2 = \sum_{n=1}^{4} x_n^2$ and the condition
that the first three components of $x'$ should be real and the fourth
imaginary (reality condition). The components of the four-dimensional
operator $u$ are transformed in the same way as the corresponding
coordinates, and if we wish to ensure the invariance of equation (307),
we must either submit the matrices $\beta_n$ to the same Lorentz
transformation

$$\beta'_n = \sum_{m=1}^{4} a_{mn}\,\beta_m \quad \Bigl(\beta'_{n\alpha\beta} = \sum_{m=1}^{4} a_{mn}\,\beta_{m\alpha\beta}\Bigr) \tag{307 a}$$

or introduce the equivalent tensor transformation in the
four-dimensional state-space

$$\beta'_n = K^{-1}\beta_nK \quad \Bigl(\beta'_{n\alpha\beta} = \sum_\gamma\sum_\delta K^{-1}_{\alpha\gamma}K_{\delta\beta}\,\beta_{n\gamma\delta}\Bigr). \tag{307 b}$$

With the help of the latter the transformed Dirac equation can be put
in the form

$$K^{-1}\Bigl(\sum_{n=1}^{4}\beta_nu'_n + m_0c\Bigr)K\psi = 0,$$

that is,

$$\Bigl(\sum_{n=1}^{4}\beta_nu'_n + m_0c\Bigr)\psi' = 0,$$

with the same numerical matrices $\beta_n$ as the original ones and with the
transformed wave function

$$\psi' = K\psi. \tag{307 c}$$

The possibility of replacing (307a) by (307 b) is proved by the fact
that the transformation matrices $a_{mn}$ (in the ordinary space-time) and
$K_{\alpha\beta}$ (in the four-dimensional state-space) have the same rank. They
contain therefore the same number of elements.

The determination of K through a can be carried out in the same 
way as in the case of rotations in ordinary space, by combining rota- 
tions in different planes. 

In the case of rotations in ordinary space the matrix $K$ must
obviously coincide with the matrix $L$ considered before. This follows from
the relations $\beta_n = \gamma_0\gamma_n$ for $n = 1, 2, 3$ ($\beta_4 = \iota\gamma_0$) in conjunction with
the fact that $\gamma_0$ is not affected by a spatial rotation. Now for a rotation
through an angle $\omega$ in the plane $(x_1, x_2)$ we have, as has been shown above,
$L = e^{\frac12 i\omega\xi_3}$ or, since $\xi_3 = i\beta_1\beta_2$ [according to (276), § 31], $L = e^{-\frac12\omega\beta_1\beta_2}$.
Identifying this with the matrix K for the case under consideration 
and taking into account the relativistic symmetry of Dirac’s equation 
in the form (307) with respect to the space coordinates and the time 
(ect), we can define the matrix K corresponding to a transition from 
a state of rest to that of a motion in the direction of the first axis with 
a velocity v by the expression 

K= e-t9BB 


corresponding to a rotation in the plane (z,,27,) through the imaginary 
angle # = tan-'v/ic. Replacing here B, by yo7,, By by «yo, and putting 
? = .6, where 
tanh@ = ° i a ay th Pad 
@=° (cosh o = Jaca tinh O = a) 
we get, since ¥ov¥1¥%o0 = —¥"1 a eas 
K = e-*6, 


This result is easily generalized for the case of motion in any direction
specified by the unit vector $n'$. Denoting the corresponding component
of $\gamma$ (i.e. the scalar product $\gamma\cdot n'$) by $\gamma_{n'}$, we get

$$K = e^{-\frac12\theta\gamma_{n'}} = \cosh\tfrac12\theta - \gamma_{n'}\sinh\tfrac12\theta. \tag{308}$$

In order to find the corresponding expression for the matrix L we 

must come back to that form of Dirac’s equation which has been used 


4 
hitherto, viz. (> Yn Untvo mM c)yp = Owithy, = dand u, = —:u,, where 
1 


the factor « is introduced in order to secure a more complete symmetry 
between the terms involving the space coordinates and the time. The 
Lorentz transformation of the components of the operator u, defined 


4 
by the equations u;, = > a,,,,u,,, must be combined with an appropriate 
m=1 
transformation of the wave function, 4’ = Ly, so that the transformed 
4 
equation shall reduce to the form (> Yn Unto Mo c)yp" == 0 with the same 
1 


matrices y,, (including y,) as the original one. Replacing % and w’ by 
Yow = d and yo’ = yf’ respectively, we come back to the equations 


(SBn%ntmocyy =O and (5B, u,-+moc)p’ = 0; 
whence it follows that L = y,1Kyo, where A is the transformation 
considered before. Since yZ = 1, i.e. yp = yo!, we can put L = yy Kyp. 

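Substituting here the expression (308) for K, we get

$$L = \gamma_0^2\cosh\tfrac12\theta-\gamma_0\gamma_{n'}\gamma_0\sinh\tfrac12\theta$$

or, since $\gamma_0^2 = 1$ and $\gamma_0\gamma_{n'}\gamma_0 = -\gamma_0^2\gamma_{n'} = -\gamma_{n'}$,

$$L = \cosh\tfrac12\theta+\gamma_{n'}\sinh\tfrac12\theta = e^{\frac12\theta\gamma_{n'}}.$$ (308 a)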
If $\gamma$ is replaced here by $i\eta$, where $\eta$ is the matrix which serves to define
the electron’s electric moment in the same way as $\xi$ defines the magnetic
one [cf. (276a)], L assumes a form quite similar to that (306a) which
corresponds to an ordinary spatial rotation. It should be remembered,
however, that while $\xi$ represents a real quantity, $\gamma$ must be considered
as a pure imaginary. This corresponds to an important distinction
between the matrices (306b) and (308a), the former being unitary
($L^\dagger = L^{-1}$ defining a rotation in the opposite sense) and the latter
Hermitian ($L^\dagger = L$).
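
The distinction between the unitary rotation matrix and the Hermitian boost matrix can be checked numerically. The sketch below is an illustration only: it assumes that a single 2x2 Hermitian matrix whose square is unity (here the Pauli matrix $\sigma_x$, named `SIGMA`) may stand in for $\gamma_{n'}$ and $\xi_n$, since only those two properties enter the argument. All function names are hypothetical.

```python
import math

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
# Assumed stand-in for gamma_n' (and xi_n): Hermitian and squaring to unity.
SIGMA = [[0j, 1 + 0j], [1 + 0j, 0j]]


def combine(a, b):
    """Return the 2x2 matrix a*1 + b*SIGMA."""
    return [[a * I2[i][j] + b * SIGMA[i][j] for j in range(2)] for i in range(2)]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(p):
    """Hermitian adjoint (conjugate transpose)."""
    return [[p[j][i].conjugate() for j in range(2)] for i in range(2)]


def close(p, q, tol=1e-12):
    return all(abs(p[i][j] - q[i][j]) < tol for i in range(2) for j in range(2))


def boost(theta):
    """cosh(theta/2) + SIGMA sinh(theta/2), the form of (308 a)."""
    return combine(math.cosh(theta / 2), math.sinh(theta / 2))


def rotation(omega):
    """cos(omega/2) + i SIGMA sin(omega/2), the form of a spatial rotation."""
    return combine(math.cos(omega / 2), 1j * math.sin(omega / 2))


beta = 0.6                      # v/c, an arbitrary sample value
theta = math.atanh(beta)        # tanh(theta) = v/c
B, R = boost(theta), rotation(1.1)

boost_is_hermitian = close(dagger(B), B)
boost_is_unitary = close(matmul(dagger(B), B), I2)
rotation_is_unitary = close(matmul(dagger(R), R), I2)
# For a pure boost the product of the adjoint with the matrix itself
# equals cosh(theta) + SIGMA sinh(theta).
norm_matches = close(matmul(dagger(B), B),
                     combine(math.cosh(theta), math.sinh(theta)))
lorentz_factor = 1 / math.sqrt(1 - beta ** 2)
```

The last line ties the hyperbolic angle to the familiar factor $1/\sqrt{1-v^2/c^2}$, which equals $\cosh\theta$.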

In the general case of a Lorentz transformation combining an
ordinary rotation $(\omega, n)$ with a relative motion $(\theta, n')$, the matrix L can
be represented as the product of the two component transformations
taken in a definite order, for instance,

$$L = e^{\frac12 i\omega\xi_n}e^{\frac12\theta\gamma_{n'}}.$$ (308 b)


The adjoint matrix is

$$L^\dagger = e^{\frac12\theta\gamma_{n'}}e^{-\frac12 i\omega\xi_n},$$

so that

$$L^\dagger L = \cosh^2\tfrac12\theta+\sinh^2\tfrac12\theta+2\gamma_{n'}\sinh\tfrac12\theta\cosh\tfrac12\theta = \cosh\theta+\gamma_{n'}\sinh\theta.$$

Substituting this expression in the formula $\rho' = \psi^\dagger L^\dagger L\psi$ for the
transformed value of the probability density (in the ‘moving’ coordinate
system), we get

$$\rho' = \psi^\dagger\psi\cosh\theta+\psi^\dagger\gamma_{n'}\psi\sinh\theta,$$

that is,

$$\rho' = \rho\cosh\theta+j_{n'}\sinh\theta = \frac{\rho+j_{n'}v/c}{\sqrt{1-v^2/c^2}},$$

in agreement with the well-known result following directly from the
Lorentz transformation equations.

If the moving axes are parallel to the original ones ($\omega = 0$) we get
in a similar way from the general formula
$j'_{n'} = \psi'^\dagger\gamma_{n'}\psi' = \psi^\dagger L^\dagger\gamma_{n'}L\psi$

$$j'_{n'} = \psi^\dagger[\gamma_{n'}(\cosh^2\tfrac12\theta+\sinh^2\tfrac12\theta)+2\cosh\tfrac12\theta\sinh\tfrac12\theta]\psi,$$

that is,

$$j'_{n'} = j_{n'}\cosh\theta+\rho\sinh\theta = \frac{j_{n'}+\rho v/c}{\sqrt{1-v^2/c^2}}.$$

It should be mentioned that instead of introducing the relativistic
imaginary $i = \sqrt{-1}$ in the definition of the fourth component of four-
dimensional vectors one can distinguish two types of real components,
namely, the covariant and the contravariant, the latter differing from
the former by the opposite sign of the fourth components. The contra-
variant components are denoted by the same letters as the covariant
ones with the index placed above instead of below. If, for instance,
$x_1 = x$, $x_2 = y$, $x_3 = z$, $x_4 = ct$ are the covariant components of the
space-time vector, then its contravariant components must be defined by
$x^1 = x$, $x^2 = y$, $x^3 = z$, $x^4 = -ct$. The square of a four-dimensional
vector, A say, is thus equal to the sum of the products of its covariant
components with the corresponding contravariant ones:

$$A^2 = \sum_{n=1}^{4} A_n A^n.$$

In a similar way the scalar product of two vectors is defined by the
sum $\sum A_n B^n$ or $\sum A^n B_n$. With this notation Dirac’s equation can be
written in the form

$$\Bigl[\sum_{n=1}^{4}\gamma^n u_n+\gamma_0 mc\Bigr]\psi = 0,$$


where $u_4 = u_t = \frac{1}{c}\bigl(\frac{h}{2\pi i}\frac{\partial}{\partial t}+e\phi\bigr)$ and $\gamma^4 = \delta$ (= 1). The covariant com-
ponents of the four-dimensional velocity vector $\gamma$ must be defined
accordingly as
$$\gamma_1 = \gamma^1,\quad \gamma_2 = \gamma^2,\quad \gamma_3 = \gamma^3,\quad \gamma_4 = -\delta\ (= -1),$$
and the contravariant components of the operator u as
$$u^1 = u_x,\quad u^2 = u_y,\quad u^3 = u_z,\quad u^4 = -u_4.$$

The transformation matrix L obtained above thus refers to the
contravariant components of $\gamma$. It is easily seen, however, that it can
be applied just as well to the covariant ones.

Quantities of the type of Dirac’s wave function quadruplet $\psi_1$, $\psi_2$,
$\psi_3$, $\psi_4$ can be regarded as forming in the space-time manifold a kind
of tensor of rank ½. This means that they are related to an ordinary
vector (i.e. tensor of the first rank) in the same way as the latter is
related to an ordinary tensor (of the second rank). This connexion is
plainly seen from the fact that an ordinary vector, like the probability
current density $(j, \rho)$, can be expressed with the help of the $\psi$’s as a
quadratic quantity, just as a tensor (of the second rank) can be
expressed as a quadratic quantity by means of the components of a
vector or of two different vectors.

It has recently been shown by various authors‡ that each of the two
pairs of functions $\psi_1$, $\psi_2$ (= $\phi_1$, $\phi_2$) and $\psi_3$, $\psi_4$ (= $\chi_1$, $\chi_2$) rather than the
whole quadruplet determines a ‘tensor of the rank ½’. Any pair of such
quantities, whose transformation properties in the state-space of the
spin coordinate (with its two values 1 and 2) are connected with the
transformation properties of vectors in the ordinary space-time mani-
fold by the above equations, are called, following Ehrenfest, a spinor.
The two components of a spinor, $\phi_1$ and $\phi_2$ say, are complex numbers;
they determine therefore four real numbers which can serve to specify
the components of an ordinary four-dimensional vector. A vector can
be defined as a particular type of spinor of the second rank, i.e. as
a quantity whose components (in the spin space!) transform like the
products of the components of two ordinary spinors, or in particular
of a single spinor $\phi$ and its adjoint quantity $\phi^\dagger$.

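It can easily be shown, for example, that the expressions

$$f_k = \phi^\dagger\sigma_k\phi \qquad (k = 1, 2, 3, 4),$$
where
$$\sigma_1 = \sigma_x,\quad \sigma_2 = \sigma_y,\quad \sigma_3 = \sigma_z,\quad \sigma_4 = \delta\ (= 1),$$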


‡ Cf. O. Laporte and G. Uhlenbeck, Phys. Rev. 37 (1931), 1380.


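that is, $f_1 = \phi_1^*\phi_2+\phi_2^*\phi_1$, $f_2 = i(\phi_1^*\phi_2-\phi_2^*\phi_1)$, $f_3 = -\phi_1^*\phi_1+\phi_2^*\phi_2$, and
$f_4 = \phi_1^*\phi_1+\phi_2^*\phi_2$ transform like the quantities x, y, z, ct in any ordinary
rotation or in a Lorentz transformation, if $\phi$ is transformed according
to $\phi' = \Lambda\phi$ and $\phi^\dagger$ according to $\phi'^\dagger = \phi^\dagger\Lambda^\dagger$, $\Lambda$ being a two-dimensional
matrix, which reduces to the form $e^{\frac12 i\omega\sigma_n} = \cos\tfrac12\omega+i\sigma_n\sin\tfrac12\omega$ already
considered in the case of an ordinary rotation (through the angle $\omega$
about an axis n). In the case of a relative motion in a direction $n'$ with
a velocity v specified by the angle $\theta = \tanh^{-1}(v/c)$ we have, so long as
the new axes are parallel to the old ones,

$$\Lambda = e^{\frac12\theta\sigma_{n'}} = \cosh\tfrac12\theta+\sigma_{n'}\sinh\tfrac12\theta \qquad (\sigma_{n'} = n'_x\sigma_x+n'_y\sigma_y+n'_z\sigma_z).$$

This gives in particular, for a motion in the z-direction,

$$\Lambda = \begin{pmatrix} e^{-\frac12\theta} & 0 \\ 0 & e^{\frac12\theta} \end{pmatrix}.$$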


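In the most general case $\Lambda$ can be represented by the product of $e^{\frac12 i\omega\sigma_n}$
with $e^{\frac12\theta\sigma_{n'}}$, that is,
$$\Lambda = \cos\tfrac12\omega\cosh\tfrac12\theta+i\sigma_n\sin\tfrac12\omega\cosh\tfrac12\theta+\sigma_{n'}\sinh\tfrac12\theta\cos\tfrac12\omega+i\sigma_n\sigma_{n'}\sin\tfrac12\omega\sinh\tfrac12\theta,$$
or, since $\sigma_n\sigma_{n'} = (\sigma\cdot n)(\sigma\cdot n') = n\cdot n'+i\sigma\cdot(n\times n')$,
$$\Lambda = \cos\tfrac12\omega\cosh\tfrac12\theta+i\,n\cdot n'\sin\tfrac12\omega\sinh\tfrac12\theta+i\sigma_n\sin\tfrac12\omega\cosh\tfrac12\theta+\sigma_{n'}\sinh\tfrac12\theta\cos\tfrac12\omega+i^2\sigma\cdot(n\times n')\sin\tfrac12\omega\sinh\tfrac12\theta.$$
The elements of this matrix are easily verified to satisfy the relation
$$|\Lambda| = \Lambda_{11}\Lambda_{22}-\Lambda_{12}\Lambda_{21} = 1.$$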
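
Both statements of this passage, the unit determinant of the general matrix and the vector character of the bilinear expressions, lend themselves to a quick numerical check. The following sketch assumes the conventional Pauli matrices (with $\sigma_z$ = diag(1, -1)), a sign convention that may differ from the one used in the text; the conclusions do not depend on that choice. The helper names are illustrative.

```python
import math


def pauli_dot(n):
    """sigma.n for a unit vector n, using the conventional Pauli matrices."""
    nx, ny, nz = n
    return [[complex(nz), complex(nx, -ny)], [complex(nx, ny), complex(-nz)]]


def a_plus_b_sigma(a, b, n):
    s = pauli_dot(n)
    return [[a * (1 if i == j else 0) + b * s[i][j] for j in range(2)]
            for i in range(2)]


def rotation(omega, n):
    return a_plus_b_sigma(math.cos(omega / 2), 1j * math.sin(omega / 2), n)


def boost(theta, n):
    return a_plus_b_sigma(math.cosh(theta / 2), math.sinh(theta / 2), n)


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def apply(m, phi):
    return (m[0][0] * phi[0] + m[0][1] * phi[1],
            m[1][0] * phi[0] + m[1][1] * phi[1])


def bilinears(phi):
    """(f1, f2, f3, f4) = phi^dagger sigma_k phi, with sigma_4 the unit matrix."""
    p1, p2 = phi
    z = p1.conjugate() * p2
    return (2 * z.real, 2 * z.imag,
            abs(p1) ** 2 - abs(p2) ** 2, abs(p1) ** 2 + abs(p2) ** 2)


n = (1 / 3, 2 / 3, 2 / 3)          # rotation axis (unit vector)
n_prime = (2 / 3, -1 / 3, 2 / 3)   # direction of motion (unit vector)
general = matmul(rotation(0.9, n), boost(0.7, n_prime))
determinant = det(general)

theta = 0.7
phi = (0.3 + 0.4j, -0.5 + 0.2j)
f = bilinears(phi)
f_boosted = bilinears(apply(boost(theta, (0.0, 0.0, 1.0)), phi))
```

With this convention a motion along z mixes $f_3$ and $f_4$ exactly as z and ct are mixed by a Lorentz transformation, while $f_1$ and $f_2$ are left unchanged.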
Using the notation
$$\phi_k^* = \phi_{\dot k},$$
i.e. replacing the conjugate complex sign by dotted indices, one can
write the covariant components of a spinor of the second rank in three
different forms, namely,
$$\phi_{kl},\quad \phi_{k\dot l},\quad \phi_{\dot k\dot l} \qquad (k, l = 1, 2),$$
these components transforming as the products $\phi_k\phi_l$, $\phi_k\phi_l^*$, and $\phi_k^*\phi_l^*$
respectively.
Besides covariant components of spinors we must also distinguish
contravariant ones. For a spinor of the first rank these are defined by
the relations
$$\phi^1 = \phi_2,\quad \phi^2 = -\phi_1,\qquad \phi^{\dot 1} = \phi_{\dot 2},\quad \phi^{\dot 2} = -\phi_{\dot 1},$$
because this ensures the invariance of the ‘scalar products’ $\phi_1\chi^1+\phi_2\chi^2$
and $\phi_{\dot 1}\chi^{\dot 1}+\phi_{\dot 2}\chi^{\dot 2}$.


The contravariant or mixed components of spinors of higher rank
are connected with the covariant ones in a similar way. We have, for
example,
$$\phi^{11} = \phi_{22},\quad \phi^{12} = -\phi_{21},\quad \phi^{21} = -\phi_{12},\quad \phi^{22} = \phi_{11},\ \text{etc.}$$
The components of a (four-dimensional) vector f can be represented
with the help of a spinor of the second rank by the formulae


P=h=Wbitd = P= f= Zlbu-di, 


P=h=s(-bitdns f= f= Mbintdi. 


We shall not engage in a more detailed discussion of this question
and shall point out in conclusion the following important circumstance.
In our derivation of Dirac’s equations as a generalization of the
equations of Maxwell’s theory we originally introduced, instead of
the quadruplet $\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$, eight quantities $M_1$, $M_2$, $M_3$, $M_0$; $N_1$, $N_2$,
$N_3$, $N_0$, visualizing the six quantities $M_1$, $M_2$, $M_3$, $-N_1$, $-N_2$, $-N_3$
as analogous to the electromagnetic field components $H_x$, $H_y$, $H_z$,
$E_x$, $E_y$, $E_z$, while $M_0$ and $N_0$ were regarded as two additional scalar
quantities. This point of view had to be abandoned in the sequel
because of the rearrangement of the Maxwell-like equations, corre-
sponding to the introduction of the additional terms containing the
rest-mass of the electron $m_0$. If, however, instead of the first-order
equations we consider the second-order equations only (which are
a generalization of the d’Alembert equations of the electromagnetic
theory), we can preserve the above point of view and treat the quantities
$M_1$, $M_2$, $M_3$, $iN_1$, $iN_2$, $iN_3$ as the components of a four-dimensional
antisymmetric tensor of the second rank $M_{kl} = -M_{lk}$ (k, l = 1, 2, 3, 4)
transforming under a Lorentz transformation in the same way as the
components of the electromagnetic field-tensor $F_{kl} = -F_{lk}$ ($F_{23} = H_x$,
$F_{31} = H_y$, $F_{12} = H_z$, $F_{14} = -iE_x$, $F_{24} = -iE_y$, $F_{34} = -iE_z$). It has
been shown further that in this case we can put $N = \pm iM$, which
corresponds to the ‘self-duality’ of the tensor $M_{kl}$, and introduce accord-
ingly the relation $N_0 = \pm iM_0$ between the scalars (invariants) $M_0$ and
$N_0$, thus reducing the eight quantities M, N to four, just as in the case
of Dirac’s equation.

The fallacy of this procedure is shown by the fact that it does not 
permit us to define a four-dimensional vector representing the probability 
current j and density $\rho$. The latter would appear in such a theory not
as the fourth (time) component of a vector but as the (4, 4)-component 


of a tensor of the second rank, corresponding to the tensor of the electro- 
magnetic energy and momentum; the components of the vector j 
would appear likewise as the (1,4), (2,4), and (3,4) components of 
this tensor, corresponding to the components of the energy-stream. So 
long as we confine ourselves to ordinary rotations in three-dimensional 
space this circumstance remains irrelevant; it becomes, however, a 
challenge to the theory when we pass to the more general Lorentz 
transformation, involving the transformation of the time. In order to 
make $j$, $\rho$ a regular four-dimensional vector we must consider the
quantities M, N as defining a spinor $\psi$ (or more exactly two spinors
$\phi$, $\chi$) whose transformation properties have been studied in this section.

The above argument serves to show in a most convincing way the
restricted character of the analogy between matter and light as repre-
sented by the probability and the electromagnetic waves respectively.
A ‘wave-mechanical’ theory of light similar to that of matter would
necessitate the introduction of a new type of probability field, con-
nected with the photons in the same way as with ordinary particles
and entirely different from the electromagnetic field which has been
used hitherto to describe the phenomena of light from the point of view
of the wave theory. It does not seem, however, that the introduction
of such a probability field with spinor properties is warranted by the
experimental facts.


36. Transformation of the Dirac Equation to Curvilinear Coordinates


We have considered hitherto cartesian coordinates only. We shall now
generalize the results obtained for a transformation from the cartesian
system x, y, z to any system of orthogonal curvilinear coordinates $q_1$, $q_2$, $q_3$.
Such a system can be specified by the following expression for the
square of the line-element (i.e. the distance between two neighbouring
points):

$$ds^2 = e_1^2\,dq_1^2+e_2^2\,dq_2^2+e_3^2\,dq_3^2,$$

where $e_1$, $e_2$, $e_3$ are mutually perpendicular vectors tangential to the
coordinate lines which pass through one end P of ds. The products $e_i\,dq_i$
play the same role as the differentials $dx_i$ of a local cartesian system
passing through P with its axes parallel to the vectors $e_i$, so that the
rectangular components of the operator p can be written in the form

$$p_k = \frac{h}{2\pi i}\frac{1}{e_k}\frac{\partial}{\partial q_k}.$$




In transforming the expression $\gamma\cdot p\psi$ $\bigl(= \sum_{k=1}^{3}\gamma_k p_k\psi\bigr)$ to the new co-
ordinates, with the help of the formula $\gamma'_k = L^{-1}\gamma_k L$, we must take into
account the fact that the matrix L is to be considered as a function of
the coordinates, varying from point to point with the direction of the
local cartesian axes. We thus get

$$\gamma'\cdot p\psi = L^{-1}\sum_{k=1}^{3}\gamma_k L p_k\psi = L^{-1}\Bigl\{\sum_k\gamma_k p_k L\psi-\sum_k\gamma_k(p_k L-Lp_k)\psi\Bigr\},$$

or

$$\gamma'\cdot p\psi = L^{-1}\sum_{k=1}^{3}\gamma_k[p_k-(p_k L-Lp_k)L^{-1}]L\psi.$$

In order to obtain the transformed equations we must accordingly
replace the components of the vector p by the ‘covariant’ operators

$$P'_k = p_k-(p_k L-Lp_k)L^{-1}.$$ (309)

In the special case of orthogonal coordinates they assume the form

$$P'_k = \frac{h}{2\pi i}\frac{1}{e_k}\Bigl(\frac{\partial}{\partial q_k}-\frac{\partial}{\partial q_k}\log L\Bigr),$$ (309 a)

where $\frac{\partial}{\partial q_k}\log L = \frac{\partial L}{\partial q_k}L^{-1}$.

Now, as has been shown above, the matrix L can be defined by the
expression $e^{\frac12 i\omega\xi_n}$, where the rotation angle $\omega$ and the axis of rotation n
must be considered as certain functions of the coordinates. We thus
have

$$\log L = \tfrac12 i\omega\,\xi\cdot n = \tfrac12 i\,\xi\cdot\omega,$$ (309 b)

the vector $\omega$ serving to determine the rotation both with respect to
magnitude and direction.

Let us consider the infinitesimal rotation $d\omega$ corresponding to a
transition from a point P (with the coordinates $q_i$) to a neighbouring
point P′ (with the coordinates $q'_i = q_i+dq_i$).

Introducing three unit vectors $f_i = e_i/e_i$ (each vector $e_i$ divided by
its length) in the direction of the coordinate lines, we can obviously put

$$df_k = d\omega\times f_k,$$
whence
$$f_i\cdot df_k = f_i\cdot(d\omega\times f_k) = d\omega\cdot(f_k\times f_i) = \pm d\omega\cdot f_h = \pm(d\omega)_h,$$
where $f_h$ is the unit vector perpendicular to $f_i$ and $f_k$, the positive sign
corresponding to an even character of the permutation $\binom{k\ i\ h}{1\ 2\ 3}$ and the
negative sign to an odd one.


We have on the other hand

$$f_i\cdot df_k = (e_i/e_i)\cdot d(e_k/e_k) = e_i\cdot de_k/(e_i e_k),$$
since the vectors $e_i$ and $e_k$ are mutually orthogonal ($i \neq k$), whence,
putting for the sake of brevity $\partial/\partial q_l = \partial_l$,

$$\pm(\partial_l\omega)_h = e_i\cdot\partial_l e_k/(e_i e_k).$$ (310)
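It follows from the formula

$$dr = e_1\,dq_1+e_2\,dq_2+e_3\,dq_3,$$
which can serve for the definition of the vectors $e_i$, that the latter are
equal to the differential coefficients of the radius vector r of the
point $(q_1, q_2, q_3)$ with respect to the corresponding coordinates. We
thus have
$$e_i = \frac{\partial r}{\partial q_i} \qquad (\partial_k e_i = \partial_i e_k)$$
and consequently
$$e_i\cdot\partial_i e_k = e_i\cdot\partial_k e_i = \tfrac12\partial_k(e_i\cdot e_i) = \tfrac12\partial_k e_i^2 = e_i\,\partial_k e_i.$$
Further, since $e_i\cdot e_k = 0$ ($k \neq i$),
$$e_k\cdot\partial_i e_i = -e_i\cdot\partial_i e_k = -e_i\cdot\partial_k e_i,$$
and if h is different from both k and i,
$$e_h\cdot\partial_i e_k = e_h\cdot\partial_k e_i = 0.$$
The latter equation is easily obtained in conjunction with the fact that
$$\partial_h(e_k\cdot e_i) = 0.$$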
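Substituting these expressions in (310) we find

$$(\partial_1\omega)_1 = 0,\quad (\partial_1\omega)_2 = \frac{1}{e_3}\partial_3 e_1,\quad (\partial_1\omega)_3 = -\frac{1}{e_2}\partial_2 e_1,$$
$$(\partial_2\omega)_1 = -\frac{1}{e_3}\partial_3 e_2,\quad (\partial_2\omega)_2 = 0,\quad (\partial_2\omega)_3 = \frac{1}{e_1}\partial_1 e_2,$$ (310 a)
$$(\partial_3\omega)_1 = \frac{1}{e_2}\partial_2 e_3,\quad (\partial_3\omega)_2 = -\frac{1}{e_1}\partial_1 e_3,\quad (\partial_3\omega)_3 = 0.$$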


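Now according to (309a) and (309 b)

$$\gamma\cdot P' = \sum_{k=1}^{3}\gamma_k\frac{h}{2\pi i}\frac{1}{e_k}\frac{\partial}{\partial q_k}-\frac{h}{4\pi}\sum_{k=1}^{3}\gamma_k\frac{1}{e_k}\,\xi\cdot\partial_k\omega,$$
that is,

$$\gamma\cdot P' = \gamma\cdot p-\frac{h}{4\pi}\,\xi\cdot\Bigl(\sum_k\gamma_k\frac{1}{e_k}\partial_k\omega\Bigr).$$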


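The first component of the vector $\sum_k\gamma_k\frac{1}{e_k}\partial_k\omega$ (i.e. its product with
$f_1$) is
$$\frac{1}{e_2}\gamma_2(\partial_2\omega)_1+\frac{1}{e_3}\gamma_3(\partial_3\omega)_1 = -\gamma_2\frac{1}{e_3}\partial_3\log e_2+\gamma_3\frac{1}{e_2}\partial_2\log e_3$$
according to (310 a). Multiplying the right-hand side by $\xi_1$ we get, since
$\xi_1\gamma_2 = \rho\xi_1\xi_2 = i\rho\xi_3 = i\gamma_3$ and $\xi_1\gamma_3 = \rho\xi_1\xi_3 = -i\rho\xi_2 = -i\gamma_2$,
$$-i\Bigl(\gamma_3\frac{1}{e_3}\frac{\partial}{\partial q_3}\log e_2+\gamma_2\frac{1}{e_2}\frac{\partial}{\partial q_2}\log e_3\Bigr).$$
We thus find

$$\xi\cdot\sum_k\gamma_k\frac{1}{e_k}\partial_k\omega = -i\Bigl[\gamma_1\frac{1}{e_1}\frac{\partial}{\partial q_1}\log(e_2 e_3)+\gamma_2\frac{1}{e_2}\frac{\partial}{\partial q_2}\log(e_3 e_1)+\gamma_3\frac{1}{e_3}\frac{\partial}{\partial q_3}\log(e_1 e_2)\Bigr],$$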


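and consequently

$$\gamma\cdot P' = \sum_{k=1}^{3}\gamma_k\Bigl[p_k-p_k\log\sqrt{\frac{e_1 e_2 e_3}{e_k}}\Bigr],$$ (311)

it being understood that the second term in the brackets represents an
ordinary number and not an operator. We can also write

$$\gamma\cdot P' = \sum_{k=1}^{3}\gamma_k\sqrt{\frac{e_1 e_2 e_3}{e_k}}\;p_k\,\frac{1}{\sqrt{e_1 e_2 e_3/e_k}},$$ (311 a)

the transformed Dirac equation being

$$\Bigl[\frac{p_t+e\phi}{c}+\gamma\cdot\Bigl(P'-\frac{e}{c}A\Bigr)+\gamma_0 m_0 c\Bigr]\psi' = 0.$$ (311 b)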


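Two special cases should be especially noted, namely, that of a
cylindrical and of a spherical coordinate system. In the former case
we have, putting

$$q_1 = r = \sqrt{x^2+y^2},\quad q_2 = \phi\ \text{(angle)},\quad q_3 = z,$$
$$e_1 = 1,\quad e_2 = r,\quad e_3 = 1,$$
and consequently

$$\gamma\cdot P' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r}-\frac{\partial}{\partial r}\log\sqrt r\Bigr)+\gamma_2\frac{1}{r}\frac{\partial}{\partial\phi}+\gamma_3\Bigl(\frac{\partial}{\partial z}-\frac{\partial}{\partial z}\log\sqrt r\Bigr)\Bigr],$$

that is,

$$\gamma\cdot P' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r}-\frac{1}{2r}\Bigr)+\gamma_2\frac{1}{r}\frac{\partial}{\partial\phi}+\gamma_3\frac{\partial}{\partial z}\Bigr].$$ (312)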


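In the latter case, putting

$$q_1 = r = \sqrt{x^2+y^2+z^2},\quad q_2 = \theta\ \text{(colatitude)},\quad q_3 = \phi,$$
we have
$$e_1 = 1,\quad e_2 = r,\quad e_3 = r\sin\theta,$$
and consequently

$$\gamma\cdot P' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r}-\frac{\partial}{\partial r}\log\sqrt{r^2\sin\theta}\Bigr)+\gamma_2\frac{1}{r}\Bigl(\frac{\partial}{\partial\theta}-\frac{\partial}{\partial\theta}\log\sqrt{r\sin\theta}\Bigr)+\gamma_3\frac{1}{r\sin\theta}\Bigl(\frac{\partial}{\partial\phi}-\frac{\partial}{\partial\phi}\log\sqrt r\Bigr)\Bigr],$$
that is,

$$\gamma\cdot P' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r}-\frac{1}{r}\Bigr)+\gamma_2\frac{1}{r}\Bigl(\frac{\partial}{\partial\theta}-\tfrac12\cot\theta\Bigr)+\gamma_3\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}\Bigr].$$ (312 a)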


This expression can be used to reduce to its simplest form the problem 
of the hydrogen atom, which has been discussed already by a less 
straightforward method in § 33. 
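
The additional terms in the cylindrical and spherical cases follow mechanically from the scale factors $e_1$, $e_2$, $e_3$. As a minimal sketch (the function names are illustrative and the derivative is taken by a simple central difference, which is an assumption of this example rather than anything prescribed by the text), the quantities $(1/e_k)\,\partial\log\sqrt{e_1e_2e_3/e_k}/\partial q_k$ that are subtracted in (311) can be evaluated numerically:

```python
import math


def connection_terms(scale_factors, q, step=1e-6):
    """Return [(1/e_k) d/dq_k log sqrt(e1*e2*e3/e_k)] for k = 1, 2, 3.

    scale_factors maps a point (q1, q2, q3) to the tuple (e1, e2, e3).
    """
    def log_root(point, k):
        e = scale_factors(point)
        return 0.5 * math.log(e[0] * e[1] * e[2] / e[k])

    terms = []
    for k in range(3):
        plus, minus = list(q), list(q)
        plus[k] += step
        minus[k] -= step
        # Central difference for the derivative with respect to q_k.
        derivative = (log_root(plus, k) - log_root(minus, k)) / (2 * step)
        terms.append(derivative / scale_factors(q)[k])
    return terms


def cylindrical(q):
    r, _phi, _z = q
    return (1.0, r, 1.0)


def spherical(q):
    r, theta, _phi = q
    return (1.0, r, r * math.sin(theta))


r, theta = 2.0, 0.7
cyl = connection_terms(cylindrical, (r, 0.3, 1.0))
sph = connection_terms(spherical, (r, theta, 0.3))
```

For the sample point the cylindrical list reproduces 1/2r, 0, 0 and the spherical list reproduces 1/r, cot θ/2r, 0, in agreement with the terms subtracted in (312) and (312 a).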

It should be mentioned that in calculating the product $\gamma\cdot A = \sum\gamma_k A_k$
the quantities $A_k$ must be understood to represent the components of
the vector potential along the axes of the local cartesian systems, i.e.
along the vectors $e_1$, $e_2$, $e_3$ ($A_k = A\cdot f_k$). The matrices $\gamma_1$, $\gamma_2$, $\gamma_3$, though
identical with the original matrices $\gamma_x$, $\gamma_y$, $\gamma_z$, have now a different
physical meaning, denoting the components of the vector $\gamma$ along the
axes of the local system and not of the original cartesian system of
coordinates.

The preceding results can be further generalized for the case of a non-
orthogonal system of curvilinear coordinates. We must distinguish in
this case contravariant and covariant components of different vectors,
the former transforming as $dq_1$, $dq_2$, $dq_3$ and the latter as $\partial/\partial q_1$, $\partial/\partial q_2$,
$\partial/\partial q_3$. Putting $p_k = \frac{h}{2\pi i}\frac{\partial}{\partial q_k}$ and denoting the contravariant components
of the vector $\gamma$ in the new system by $\gamma'^k$, we can write the operator
$\gamma'\cdot p$ in the form $\sum\gamma'^k p_k$.

Introducing a generalized (non-unitary) transformation matrix L
according to the condition

$$\gamma'^k = L^{-1}\gamma_k L$$
(where $\gamma_1 = \gamma_x$, $\gamma_2 = \gamma_y$, $\gamma_3 = \gamma_z$), we get
$$(\gamma'\cdot p)\psi = L^{-1}\sum_{k=1}^{3}\gamma_k[p_k-(p_k L-Lp_k)L^{-1}]L\psi,$$


whence it follows that the transformed Dirac equation for the new wave 


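functions $\psi' = L\psi$ will differ from the original one in the same way as
in the case of orthogonal coordinates, the operators $p_k = (h/2\pi i)\,\partial/\partial q_k$
being replaced by

$$P'_k = \frac{h}{2\pi i}\Bigl(\frac{\partial}{\partial q_k}-\frac{\partial}{\partial q_k}\log L\Bigr) = p_k-(p_k\log L).$$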

We shall not determine here the matrix L for the general case of
non-orthogonal coordinates, for it is not of practical interest.

The preceding results can be further generalized by introducing four-
dimensional transformations, involving not only the space coordinates
but also the time. Such transformations can be used to include the 
effects of the gravitational field on the motion of the electron in 
accordance with the relativity theory of gravitation. These considera- 
tions lie, however, beyond the scope of the present book. 

In conclusion, the following transformation property of Dirac’s equa- 
tion should be mentioned. 

The electromagnetic field is represented in Dirac’s equation by the
potentials A, $\phi$. Now from the relations $E = -\nabla\phi-\partial A/c\,\partial t$, $H = \operatorname{curl}A$,
it follows that the electromagnetic field strengths are not altered if A is
replaced by $A' = A+\nabla\chi$ and $\phi$ by $\phi' = \phi-\partial\chi/c\,\partial t$, where $\chi$ is an arbi-
trary function of the coordinates and of the time. Since it is the field
strengths and not the potentials which have a direct physical meaning,
the above transformation of the potentials must be irrelevant for
Dirac’s equation; that is, the transformed equation

$$[(p_t+e\phi')/c+\gamma\cdot(p-eA'/c)+\gamma_0 mc]\psi' = 0$$
must be equivalent to the equation

$$[(p_t+e\phi)/c+\gamma\cdot(p-eA/c)+\gamma_0 mc]\psi = 0$$

with the original potentials. This is easily verified, the transformed
wave function $\psi'$ being connected with the original one by the equation
$\psi' = e^{2\pi ie\chi/hc}\psi$. So long as $\chi$ is a real quantity (as of course it must be),
the two functions, or rather function-quadruplets, $\psi$ and $\psi'$ correspond to
identical values of the probabilities and thus determine the same motion.
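
The verification mentioned here can be illustrated numerically in one dimension. The sketch below is an assumption-laden toy case, not the four-component equation: units are chosen so that $h/2\pi = e = c = 1$, the wave function, vector potential and gauge function are arbitrary sample functions, and derivatives are taken by central differences. It checks that $(p-eA'/c)\psi'$ equals the phase factor times $(p-eA/c)\psi$ and that the probability density is unaltered.

```python
import cmath
import math

HBAR = E_CHARGE = C = 1.0   # assumed units, chosen only for this illustration


def psi(x):
    """Sample wave function (arbitrary choice)."""
    return cmath.exp(-x * x + 2j * x)


def vector_potential(x):
    """Sample component of A (arbitrary choice)."""
    return math.sin(x)


def chi(x):
    """Sample real gauge function (arbitrary choice)."""
    return 0.5 * x * x + math.cos(x)


def d_dx(func, x, step=1e-5):
    return (func(x + step) - func(x - step)) / (2 * step)


def kinetic_momentum(wave, potential, x):
    """Apply (p - eA/c) to wave at x, with p = (hbar/i) d/dx."""
    return (HBAR / 1j) * d_dx(wave, x) - E_CHARGE / C * potential(x) * wave(x)


def phase(x):
    return cmath.exp(1j * E_CHARGE * chi(x) / (HBAR * C))


def psi_new(x):
    return phase(x) * psi(x)


def potential_new(x):
    return vector_potential(x) + d_dx(chi, x)


x0 = 0.7
lhs = kinetic_momentum(psi_new, potential_new, x0)
rhs = phase(x0) * kinetic_momentum(psi, vector_potential, x0)
density_old = abs(psi(x0)) ** 2
density_new = abs(psi_new(x0)) ** 2
```

The two sides agree to within the accuracy of the finite differences, which is the one-dimensional content of the equivalence of the two equations.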

This transformation can be considered as a special transformation
of coordinates, the transformation matrix L being defined as the pro-
duct of the matrix $\delta = 1$ by the function $e^{2\pi ie\chi/hc}$. It is clear that the
coordinates are actually not affected by a transformation of this type.
We see at the same time that the introduction of an electromagnetic
field can be described in a geometrical language as a generalization of
ordinary coordinate transformations, the quantities $(h/2\pi i)\,\partial L/\partial q_k$ being
replaced by $eA_k/c$.


VII 
THE PROBLEM OF MANY PARTICLES 


37. General Results, Virial Theorem, Linear and Angular 
Momentum 

The problem of many particles has been considered already in the first
part (Chapter IV) on the basis of the non-relativity mechanics of a single
particle. Using the method of the configuration space, we arrived, in
the case of two different particles, at the equation (101), which in the
general case of a system of n different particles with the masses
$m_1, m_2, \ldots, m_n$ and the potential energy $U(x_1, y_1, z_1; \ldots; x_n, y_n, z_n; t)$ can be
written in the form

$$\Bigl[\sum_{k=1}^{n}\frac{1}{m_k}\nabla_k^2-\frac{8\pi^2}{h^2}\Bigl(\frac{h}{2\pi i}\frac{\partial}{\partial t}+U\Bigr)\Bigr]\psi = 0,$$ (313)

where
$$\nabla_k^2 = \frac{\partial^2}{\partial x_k^2}+\frac{\partial^2}{\partial y_k^2}+\frac{\partial^2}{\partial z_k^2}.$$
Using the notation
$$p_{kx} = \frac{h}{2\pi i}\frac{\partial}{\partial x_k},\quad p_{ky} = \frac{h}{2\pi i}\frac{\partial}{\partial y_k},\quad p_{kz} = \frac{h}{2\pi i}\frac{\partial}{\partial z_k},$$ (313 a)
and
$$H = \sum_{k=1}^{n}\frac{1}{2m_k}p_k^2+U,$$ (313 b)
we can rewrite (313) in the standard operator form
$$(H+p_t)\psi = 0,$$ (314)
$p_t$ denoting as usual $\frac{h}{2\pi i}\frac{\partial}{\partial t}$, while H represents the energy operator
or Hamiltonian for the system under consideration. It agrees with the
classical expression of the energy if the operators $p_k$ are regarded as
representing the momenta of the separate particles. The wave-
mechanical equation (314) thus corresponds to the classical energy
equation $H-W = 0$ if $-p_t$ is replaced by the value of the energy W.
This correspondence has exactly the same character as for a single
particle, for which it has been discussed in detail in Chapters I and II.
We need not repeat here all that has been stated there, as well as in 
the following three chapters, concerning the matrix representation, the 
transformation, and the perturbation theory. It may suffice to remark 
that a system of particles, defined by the Hamiltonian (313 b), can be
dealt with from the mathematical point of view as a single particle
moving in a space of 3n dimensions, with the coordinates

$$q_1 = \sqrt{\frac{m_1}{m}}\,x_1,\quad q_2 = \sqrt{\frac{m_1}{m}}\,y_1,\quad q_3 = \sqrt{\frac{m_1}{m}}\,z_1,\quad \ldots,\quad q_{3n} = \sqrt{\frac{m_n}{m}}\,z_n.$$ (314 a)

Here m is an arbitrary coefficient of the dimension of a mass, which
can be regarded as the mass of the ‘equivalent’ particle. We can put,
for instance, $m = (m_1 m_2\ldots m_n)^{1/n}$, which gives

$$dV = dx_1\,dy_1\,dz_1\ldots dx_n\,dy_n\,dz_n = dq_1\,dq_2\ldots dq_{3n}.$$
The corresponding momentum components are defined in the classical
theory by the formulae

$$p_1 = m\dot q_1 = \sqrt{\frac{m}{m_1}}\,p_{1x},\quad \ldots,\quad p_{3n} = \sqrt{\frac{m}{m_n}}\,p_{nz}.$$

They are represented in the wave-mechanical theory by the operators

$$p_1 = \sqrt{\frac{m}{m_1}}\,p_{1x} = \sqrt{\frac{m}{m_1}}\,\frac{h}{2\pi i}\frac{\partial}{\partial x_1},\quad \ldots,\quad p_{3n} = \sqrt{\frac{m}{m_n}}\,\frac{h}{2\pi i}\frac{\partial}{\partial z_n},$$

that is, according to (314a),
$$p_\alpha = \frac{h}{2\pi i}\frac{\partial}{\partial q_\alpha} \qquad (\alpha = 1, 2, \ldots, 3n),$$ (314 b)
just as in the case of a single particle. Expressed in terms of these
coordinates and momenta the Hamiltonian (313 b) assumes the standard
form

$$H = \frac{1}{2m}\sum_{\alpha=1}^{3n}p_\alpha^2+U(q_1,\ldots,q_{3n}).$$ (314 c)

All the developments of the first five chapters of this part, referring 
to the motion of a particle in ordinary three-dimensional space, can be 
immediately generalized for the case of a symbolic particle representing 
a system of n ordinary particles in the 3n-dimensional configuration 
space. The generalization is in fact so simple that it is hardly necessary 
to dwell upon it. 
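
A short numerical illustration of the rescaling (314 a) may be helpful. The masses and momenta below are arbitrary sample values (assumptions of this example); the sketch checks that the kinetic energy of the n particles equals that of the single ‘equivalent’ particle of mass m, and that the choice of m as the geometric mean of the masses makes the Jacobian of the transformation equal to unity.

```python
import math

masses = [1.0, 4.0, 9.0]                      # sample masses m_k
n = len(masses)
m = math.prod(masses) ** (1.0 / n)            # geometric mean (m_1 ... m_n)^(1/n)

# Sample cartesian momenta (p_kx, p_ky, p_kz) of the n particles.
momenta = [(0.3, -1.2, 0.5), (2.0, 0.1, -0.7), (-1.5, 0.4, 0.9)]

# Kinetic energy in the original form: sum over particles of p_k^2 / 2 m_k.
kinetic_particles = sum(sum(c * c for c in p) / (2 * mk)
                        for p, mk in zip(momenta, masses))

# Rescaled momenta p_alpha = sqrt(m / m_k) * p_k of the equivalent particle.
scaled = [math.sqrt(m / mk) * c for p, mk in zip(momenta, masses) for c in p]
kinetic_equivalent = sum(c * c for c in scaled) / (2 * m)

# Jacobian d(q_1 ... q_3n) / d(x_1 ... z_n) = product of (m_k / m)^(3/2).
jacobian = math.prod((mk / m) ** 1.5 for mk in masses)
```

With any other choice of m the two kinetic energies still agree, but the Jacobian differs from unity by the factor $(m_1 m_2\ldots m_n/m^n)^{3/2}$.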

We shall therefore limit ourselves to the discussion of a few pecu- 
liarities connected with the physical meaning of the problem and to 
the possibility of completing and refining the theory in the same sense 
as has been done in the preceding chapter for the case of a single 
particle. 

From equation (314c) and its conjugate complex $(H-p_t)\psi^* = 0$


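($H^* = H$) we can obtain in the usual way the ‘conservation’ or con-
tinuity equation

$$\frac{\partial\rho}{\partial t}+\sum_{\alpha=1}^{3n}\frac{\partial j_\alpha}{\partial q_\alpha} = 0,$$ (315)
where $\rho = \psi\psi^*$ is the probability density in the configuration space, and
$$j_\alpha = \frac{h}{4\pi im}\Bigl(\psi^*\frac{\partial\psi}{\partial q_\alpha}-\psi\frac{\partial\psi^*}{\partial q_\alpha}\Bigr)$$
the components of the 3n-dimensional probability current. If equation
(315) is multiplied by the volume-element of the configuration space
$$dV = dV_1\,dV_2\ldots dV_n = \Bigl(\frac{m^n}{m_1 m_2\ldots m_n}\Bigr)^{3/2}dq_1\,dq_2\ldots dq_{3n}$$
and integrated over all this space, the result obtained is

$$\frac{d}{dt}\int\rho\,dV = 0,$$

expressing the law of conservation of probability.†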

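If, however, the integration is extended over the configuration space
of the second, third,..., nth particle, while the coordinates of the first
one, $x_1$, $y_1$, $z_1$, are kept constant, we obtain an equation of the usual
three-dimensional form

$$\frac{\partial\rho_1}{\partial t}+\frac{\partial j_{1x}}{\partial x_1}+\frac{\partial j_{1y}}{\partial y_1}+\frac{\partial j_{1z}}{\partial z_1} = 0,$$ (315 a)

where the quantities
$$\rho_1 = \int\!\ldots\!\int\rho\,dV_2\,dV_3\ldots dV_n,\qquad j_{1x} = \int\!\ldots\!\int j_{x_1}\,dV_2\,dV_3\ldots dV_n,\ \text{etc.}$$ (315 b)
(with $j_{x_1}$ denoting the component of the configuration-space current
along $x_1$) can be interpreted as the probability density and current
density for the first particle in the ordinary three-dimensional space.
The same results hold, of course, for each of the other particles.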

In the particular case of a system of particles which do not act on each
other the equation $(H+p_t)\psi = 0$ has multiplicative solutions of the
form $\psi = \psi_1\psi_2\ldots\psi_n$, where $\psi_k$ depends upon the coordinates of the kth
particle alone; we get accordingly in this case

$$\rho_k = \psi_k\psi_k^*,\qquad j_k = \frac{h}{4\pi im_k}(\psi_k^*\nabla_k\psi_k-\psi_k\nabla_k\psi_k^*)$$
(provided the separate factors of $\psi$ are normalized to unity) and con-
sequently $\rho = \rho_1\rho_2\ldots\rho_n$. This result was the starting-point of our
discussion of the problem of many particles in Part I. In the general
case $\rho$ is, of course, different from the product $\rho_1\rho_2\ldots\rho_n$; this circum-
stance corresponds to a mutual dependence of the particles, a depen-
dence specified by the form of the potential-energy function U or also
by statistical (i.e. symmetry) conditions, if the particles are all alike
(see below).

† We shall assume for the sake of simplicity that the integral $\int\rho\,dV$ is convergent,
which means that the particles are bound to remain in a finite region of space.
The function U may be assumed to have the form

$$U = \sum_k U_k(r_k, t)+\sum_{k<l}U_{kl}(r_{kl}),$$

the first sum corresponding to the action of external forces, which can
depend upon the time explicitly, while the second represents the mutual
action of the particles ($U_{kl} = U_{lk}$, $r_{kl} = |r_k-r_l|$ = distance between
the kth and lth particles).

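If U does not depend upon t, then equation (314) admits solutions
of the form $\psi = \psi^0_{H'}(x_1,\ldots,z_n)\,e^{-i2\pi H't/h}$, where $\psi^0_{H'}$ and $H'$ are the charac-
teristic functions and the characteristic values of the energy operator
satisfying the usual equation

$$(H-H')\psi^0_{H'} = 0.$$ (316)
In the case of a discrete spectrum of H the functions $\psi^0_{H'}$ are easily
proved to be orthogonal to each other (in the configuration space), this
orthogonality being a consequence of the self-adjoint character of the
operator H, since
$$f_1 Hf_2-f_2 Hf_1 = \frac{h^2}{8\pi^2 m}\sum_\alpha\frac{\partial}{\partial q_\alpha}\Bigl(f_2\frac{\partial f_1}{\partial q_\alpha}-f_1\frac{\partial f_2}{\partial q_\alpha}\Bigr).$$

Another interesting consequence of this self-adjointness of H is the
possibility of replacing the preceding equation by the variational
equation,
$$\delta\int\psi^{0*}H\psi^0\,dV = 0,$$ (316 a)
with the accessory condition,
$$\int\psi^{0*}\psi^0\,dV = 1,$$
expressing the normalization of the functions $\psi^0_{H'}$. Using partial
integration,
$$\int\psi^{0*}\frac{\partial^2\psi^0}{\partial q_\alpha^2}\,dV = -\int\frac{\partial\psi^{0*}}{\partial q_\alpha}\frac{\partial\psi^0}{\partial q_\alpha}\,dV,$$
(316 a) can be rewritten in the form

$$\delta\int\Bigl[\frac{h^2}{8\pi^2 m}\sum_\alpha\frac{\partial\psi^{0*}}{\partial q_\alpha}\frac{\partial\psi^0}{\partial q_\alpha}+U\psi^{0*}\psi^0\Bigr]dV = 0,$$ (316 b)

which involves the first derivatives of $\psi^0$ only ($dV = dq_1\ldots dq_{3n}$).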



An interesting application of the variational equation is afforded
by the following very simple and general proof of the virial theorem
(due to V. Fock). Let us replace the function $\psi^0(q_1,\ldots,q_{3n})$, which is
a solution of our problem, by the function $\psi' = c\psi^0(\lambda q_1,\ldots,\lambda q_{3n})$, which
is obtained from it by multiplying each coordinate by a certain
parameter $\lambda$ and introducing a normalizing factor c. Introducing
further a new set of coordinates $q'_\alpha = \lambda q_\alpha$, we can write the normaliz-
ing condition $\int\psi'^*\psi'\,dV = 1$ in the form $\int\lambda^{-3n}\psi'^*\psi'\,dV' = 1$, where
$dV' = dq'_1\ldots dq'_{3n}$, which gives, on using the original normalizing con-
dition,

$$\int\psi^{0*}(q)\psi^0(q)\,dV = \int\psi^{0*}(q')\psi^0(q')\,dV' = 1,$$
$c = \lambda^{3n/2}$. Using this value of c we can reduce the variational equation
(316 a) or (316 b) to the form $\partial H'/\partial\lambda = 0$, where $H'$ is the value of the
integral (316 a) or (316 b) which is obtained by replacing the function
$\psi^0$ by the function $\psi'$. Its minimum value corresponds, of course, to
$\lambda = 1$, which is the solution of the equation $\partial H'/\partial\lambda = 0$.
Now using the coordinates $q'_\alpha$, we have

$$H' = \int\Bigl[\lambda^2\frac{h^2}{8\pi^2 m}\sum_\alpha\frac{\partial\psi^{0*}(q')}{\partial q'_\alpha}\frac{\partial\psi^0(q')}{\partial q'_\alpha}+U(q'/\lambda)\,\psi^{0*}(q')\psi^0(q')\Bigr]dV',$$

so that the preceding equation assumes the form

$$\int\Bigl[2\lambda\frac{h^2}{8\pi^2 m}\sum_\alpha\frac{\partial\psi^{0*}}{\partial q'_\alpha}\frac{\partial\psi^0}{\partial q'_\alpha}-\frac{1}{\lambda^2}\sum_\alpha q'_\alpha\frac{\partial U}{\partial q_\alpha}\psi^{0*}\psi^0\Bigr]dV' = 0.$$

Putting here $\lambda = 1$ and $q'_\alpha = q_\alpha$, we get

$$2\overline T = \sum_{\alpha=1}^{3n}\overline{q_\alpha\frac{\partial U}{\partial q_\alpha}},$$ (317)

where $\overline T$ denotes the probable (average) value of the kinetic energy of
the system, and $\tfrac12\sum_\alpha\overline{q_\alpha\,\partial U/\partial q_\alpha}$ is its ‘virial’. We have obviously

$$\sum_{\alpha=1}^{3n}q_\alpha\frac{\partial U}{\partial q_\alpha} = \sum_{k=1}^{n}\Bigl(x_k\frac{\partial U}{\partial x_k}+y_k\frac{\partial U}{\partial y_k}+z_k\frac{\partial U}{\partial z_k}\Bigr).$$
If the potential energy is a homogeneous function of the coordinates,
this expression reduces to the product of U with the number specifying
the corresponding power. In the special case of a system of electrified
particles obeying the Coulomb law (which is approximately the case
with any actual material system constituted by protons and electrons),
we must have
$$2\overline T = -\overline U$$ (317 a)
or, since $\overline T+\overline U = W$ (= total energy of the system),
$$\overline T = -W.$$ (317 b)
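
The scaling argument can be tried on a concrete case. For a potential energy that is homogeneous of degree ν, the trial energy takes the form $H'(\lambda) = \lambda^2\overline T+\lambda^{-\nu}\overline U$, and the condition $\partial H'/\partial\lambda = 0$ at $\lambda = 1$ gives $2\overline T = \nu\overline U$. The sketch below assumes the ground state of the hydrogen atom in atomic units, for which $\overline T = 1/2$ and $\overline U = -1$; these numbers and the function names are assumptions of the example and are not taken from the text.

```python
def scaled_energy(lam, kinetic, potential, degree):
    """H'(lambda) for a potential homogeneous of the given degree.

    Stretching the coordinates by lambda multiplies the mean kinetic energy
    by lambda**2 and the mean potential energy by lambda**(-degree).
    """
    return lam ** 2 * kinetic + lam ** (-degree) * potential


def derivative_at_one(kinetic, potential, degree, step=1e-6):
    upper = scaled_energy(1 + step, kinetic, potential, degree)
    lower = scaled_energy(1 - step, kinetic, potential, degree)
    return (upper - lower) / (2 * step)


# Hydrogen ground state in atomic units (assumed values): T = 1/2, U = -1.
T_mean, U_mean, coulomb_degree = 0.5, -1.0, -1
W = T_mean + U_mean

slope = derivative_at_one(T_mean, U_mean, coulomb_degree)
energy_at_one = scaled_energy(1.0, T_mean, U_mean, coulomb_degree)
energy_below = scaled_energy(0.9, T_mean, U_mean, coulomb_degree)
energy_above = scaled_energy(1.1, T_mean, U_mean, coulomb_degree)
```

The slope vanishes at $\lambda = 1$ and the neighbouring values of $\lambda$ give a higher energy, so that the relations (317 a) and (317 b) hold for this sample state.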


It should be remembered that these relations hold only so long as 
the particles remain actually bound to each other, which is expressed 
mathematically by the convergence of the integral $\int|\psi|^2\,dV$, a con-
vergence that subsists so long as the energy W of the state under
consideration belongs to the discrete spectrum. It should further be 
remarked that they remain valid if some of the particles are treated as 
fixed centres of force producing an ‘external’ Coulomb field of force. 

We shall now establish a few other general laws which hold for a 
closed system of particles, i.e. a system unaffected by external forces, 
such as an isolated atom or molecule, etc. 

These laws are the exact equivalents of the laws of classical mechanics 
concerning the conservation of the energy, momentum, and of the 
moment of momentum (or angular momentum) of the system. The 
first of them has been stated already. The other two can be established 
with the help of the relation

$$\frac{dF}{dt} = [H, F] = \frac{2\pi i}{h}(HF-FH).$$
We put
$$F = p = \sum_{k=1}^{n}p_k$$
or
$$F = M = \sum_{k=1}^{n}r_k\times p_k,$$
in accordance with the classical definition of the total momentum and
angular momentum (the origin from which the vectors $r_k$ are supposed
to be drawn can be chosen arbitrarily).

Taking the x-component of p, we have

$$[H, p_x] = [U, p_x] = \sum_k[U, p_{kx}] = -\sum_k\frac{\partial U}{\partial x_k}.$$

Now $-\partial U/\partial x_k$ represents the force acting on the kth particle in the
direction of the x-axis; so long as there are no external forces, the sum
of such forces for all the particles must obviously vanish. Hence we get

$$\frac{dp}{dt} = 0.$$


In a similar way we have

$$[H, M_x] = \sum_k[H, y_k p_{kz}-z_k p_{ky}] = \sum_k\{[H, y_k]p_{kz}+y_k[H, p_{kz}]-[H, z_k]p_{ky}-z_k[H, p_{ky}]\};$$
that is, since

$$[H, y_k] = \frac{\partial H}{\partial p_{ky}} = \frac{p_{ky}}{m_k},\qquad [H, p_{ky}] = -\frac{\partial H}{\partial y_k} = -\frac{\partial U}{\partial y_k} = F_{ky},\ \text{etc.},$$
$$[H, M_x] = \sum_k\frac{1}{m_k}(p_{ky}p_{kz}-p_{kz}p_{ky})-\sum_k\Bigl(y_k\frac{\partial U}{\partial z_k}-z_k\frac{\partial U}{\partial y_k}\Bigr).$$

The first sum vanishes since $p_{ky}$ and $p_{kz}$ commute with each other, while
the second is equal to the x-component of the vector $\sum_k(r_k\times F_k)$, where
$F_k$ is the force acting on the kth particle. We thus get
$$\frac{d}{dt}M = \sum_k r_k\times F_k,$$
just as in the classical theory. It is easy to see that in the case of
central forces, which we are considering, the vector $\sum_k r_k\times F_k$ (repre-
senting the resulting torque of all the forces acting on all the particles)
vanishes. We have in fact, putting $F_k = \sum_l F_{kl}$ and taking into account
that $F_{kl} = -F_{lk} = f_{kl}(r_k-r_l)$,
$$\sum_k r_k\times F_k = \tfrac12\sum_k\sum_l(r_k-r_l)\times F_{kl} = 0.$$
Hence it follows that
$$\frac{d}{dt}M = 0,$$
i.e. the conservation law for the resulting angular momentum.
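
The cancellation of the internal forces and torques is easily illustrated with numbers. The sketch below assumes three particles at arbitrary sample positions and a pair force directed along the line joining the particles with an inverse-square magnitude; both the positions and the force law are assumptions made for the example, since any force of the form $f_{kl}(r_k-r_l)$ with $f_{kl} = f_{lk}$ would serve.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def scale(a, factor):
    return tuple(factor * x for x in a)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pair_force(r_k, r_l):
    """Force on particle k due to particle l, directed along r_k - r_l."""
    d = sub(r_k, r_l)
    distance = math.sqrt(sum(x * x for x in d))
    return scale(d, 1.0 / distance ** 3)   # assumed inverse-square repulsion


positions = [(0.0, 0.0, 0.0), (1.0, 0.5, -0.2), (-0.7, 1.3, 0.9)]

forces = []
for k, r_k in enumerate(positions):
    total = (0.0, 0.0, 0.0)
    for l, r_l in enumerate(positions):
        if l != k:
            total = add(total, pair_force(r_k, r_l))
    forces.append(total)

total_force = (0.0, 0.0, 0.0)
total_torque = (0.0, 0.0, 0.0)
for r_k, f_k in zip(positions, forces):
    total_force = add(total_force, f_k)
    total_torque = add(total_torque, cross(r_k, f_k))
```

Both the resulting force and the resulting torque vanish to rounding accuracy, whatever origin is chosen for the vectors $r_k$.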

This result, as well as the preceding one, can be obtained by another 
method based on the invariance of the energy operator with regard to 
a transformation of the coordinates (and momentum components) in- 
volving a shift of the origin and a rotation of the axes about it. Let 
P be some fixed point in space (or in the configuration space) and P’ 
another point which in the new system has the same coordinates as 
P has in the old one ($x'_k = x_k$, etc.). If $f(P)$ is some function of the old
coordinates, then the transformed function will be defined by the con- 
dition Tf(P) = f(P’), T denoting the transformation under considera- 
tion. The coordinates of the point P’ in the original system are defined 
by the linear transformation equations 


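$$\begin{aligned}
x'_k &= x_0 + \alpha_{11}x_k + \alpha_{12}y_k + \alpha_{13}z_k,\\
y'_k &= y_0 + \alpha_{21}x_k + \alpha_{22}y_k + \alpha_{23}z_k,\\
z'_k &= z_0 + \alpha_{31}x_k + \alpha_{32}y_k + \alpha_{33}z_k,
\end{aligned}\qquad (k = 1, 2, \ldots, n).$$

In the special case of an infinitesimal transformation these equations
reduce to

$$\begin{aligned}
x'_k - x_k &= \delta x_k = x_0 + \omega_y z_k - \omega_z y_k,\\
y'_k - y_k &= \delta y_k = y_0 + \omega_z x_k - \omega_x z_k,\\
z'_k - z_k &= \delta z_k = z_0 + \omega_x y_k - \omega_y x_k,
\end{aligned}$$

where $\omega_x$, $\omega_y$, $\omega_z$ are the components of an infinitesimal rotation $\boldsymbol{\omega}$,
while $x_0$, $y_0$, $z_0$ are the components of an infinitesimal displacement $\mathbf{r}_0$.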
We obtain in this case 


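$$\begin{aligned}
Tf(P) &= f(P) + \sum_k\Bigl(\frac{\partial f}{\partial x_k}\,\delta x_k + \frac{\partial f}{\partial y_k}\,\delta y_k + \frac{\partial f}{\partial z_k}\,\delta z_k\Bigr)\\
&= f(P) + x_0\sum_k\frac{\partial f}{\partial x_k} + y_0\sum_k\frac{\partial f}{\partial y_k} + z_0\sum_k\frac{\partial f}{\partial z_k}\\
&\quad + \omega_x\sum_k\Bigl(y_k\frac{\partial f}{\partial z_k} - z_k\frac{\partial f}{\partial y_k}\Bigr) + \omega_y\sum_k\Bigl(z_k\frac{\partial f}{\partial x_k} - x_k\frac{\partial f}{\partial z_k}\Bigr)\\
&\quad + \omega_z\sum_k\Bigl(x_k\frac{\partial f}{\partial y_k} - y_k\frac{\partial f}{\partial x_k}\Bigr),
\end{aligned}$$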
the derivatives of f being taken for the point P. Neglecting small 
quantities of the second order, we can replace in this equation the 
primed letters (referring to P’) by the unprimed (referring to P), 
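which gives

$$Tf(P) = f(P) + \frac{2\pi i}{h}\bigl[(\mathbf{r}_0\cdot\mathbf{p} + \boldsymbol{\omega}\cdot\mathbf{M})f\bigr]_P,$$

where $\mathbf{p} = \sum_k \mathbf{p}_k$, while $\mathbf{M}$ denotes, as before, the operator of the
resulting angular momentum. We thus see that an infinitesimal transformation
$T$ can be represented by an ordinary linear differential operator

$$T = 1 + \frac{2\pi i}{h}\,(\mathbf{r}_0\cdot\mathbf{p} + \boldsymbol{\omega}\cdot\mathbf{M}).$$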


Now it is obvious from symmetry considerations that the energy H 
remains invariant under a transformation of the type $T$ since the latter
alters neither the value of the potential energy (depending on the
relative position of the particles only) nor the expression of the kinetic
energy operator (the operators $\nabla_k^2$ being independent of the orientation
of the coordinate system or of the position of its origin). This circumstance
can be expressed by the condition $TH\psi = HT\psi$, that is,
$HT = TH$, which, on the other hand, means that the operator $T$ represents
a constant of the motion. In view of the arbitrariness of the
(infinitesimal) vectors $\mathbf{r}_0$ and $\boldsymbol{\omega}$, the equation $T = \text{const.}$ is split up
into two independent equations: $\mathbf{p} = \text{const.}$ and $\mathbf{M} = \text{const.}$, expressing
the conservation law of the resulting linear and angular momentum of
the system.
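
The first-order formula for an infinitesimal displacement and rotation can be tested numerically. The sketch below treats a single particle and uses the fact that $(2\pi i/h)\,\mathbf{p}f = \nabla f$ and $(2\pi i/h)\,\mathbf{M}f = \mathbf{r}\times\nabla f$; the test function `f`, the point `r`, and the small vectors `r0` and `omega` are arbitrary assumptions chosen for illustration.

```python
import numpy as np

def f(r):
    """Arbitrary smooth test function of the coordinates (an assumption)."""
    x, y, z = r
    return np.exp(-0.5 * (x**2 + 2.0 * y**2)) * (1.0 + x * z)

def grad_f(r, eps=1e-6):
    """Central-difference gradient of f at the point r."""
    g = np.zeros(3)
    for a in range(3):
        step = np.zeros(3)
        step[a] = eps
        g[a] = (f(r + step) - f(r - step)) / (2.0 * eps)
    return g

r = np.array([0.3, -0.7, 0.5])
r0 = 1e-4 * np.array([1.0, -2.0, 0.5])      # infinitesimal displacement
omega = 1e-4 * np.array([0.5, 1.5, -1.0])   # infinitesimal rotation

# Exact transformed value f(P'), with P' = P + r0 + omega x r
exact = f(r + r0 + np.cross(omega, r))

# First-order expression f + (r0 . grad + omega . (r x grad)) f
g = grad_f(r)
first_order = f(r) + r0 @ g + omega @ np.cross(r, g)

error = abs(exact - first_order)
```

The residual `error` is of the second order in `r0` and `omega`, while the change `exact - f(r)` itself is of the first order.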

These laws are, of course, no longer satisfied in the presence of 
external forces. If, however, the latter reduce to an attraction to a 
fixed point—as in the case of a system of electrons revolving about 
a fixed nucleus supposed to act like a point-charge—then we still have 
$\mathbf{M} = \text{const.}$ In the presence of a homogeneous field (magnetic or
electric) parallel to a fixed direction in space, the energy operator
remains invariant for rotations about this direction only, and we obtain 
accordingly the conservation law for the corresponding projection of 
the angular momentum, the components of the latter in the perpendi- 
cular directions being no longer constant. 

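The operator $\mathbf{M}_k = \mathbf{r}_k \times \mathbf{p}_k$ representing the angular momentum of
a single particle satisfies, as we know, the relation

$$\mathbf{M}_k \times \mathbf{M}_k = -\frac{h}{2\pi i}\,\mathbf{M}_k.$$

Replacing $\mathbf{M}_k$ by the resultant angular momentum operator $\mathbf{M} = \sum_k \mathbf{M}_k$,
we have

$$\mathbf{M}\times\mathbf{M} = \sum_k \mathbf{M}_k\times\mathbf{M}_k + \sum_{k<l}(\mathbf{M}_k\times\mathbf{M}_l + \mathbf{M}_l\times\mathbf{M}_k) = \sum_k \mathbf{M}_k\times\mathbf{M}_k,$$

since the operators $\mathbf{M}_k$ and $\mathbf{M}_l$ referring to different particles obviously
commute with each other. We thus get for the resulting angular momentum
the same relation as for the component ones, viz.:

$$\mathbf{M}\times\mathbf{M} = -\frac{h}{2\pi i}\,\mathbf{M}. \tag{318}$$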

It has been shown in Chap. III, § 13, that it is possible by means of 
the matrix method to derive from this relation the matrix elements 
of $\mathbf{M}$ in a representation specified by the condition that $M^2$ and $M_z$
should be diagonal matrices (corresponding to a given value of the
energy). The number of particles involved is obviously immaterial (so
long as $\mathbf{M}$ commutes with $H$) and the results obtained before for the
case of a single particle can be directly applied to the present case.
We thus obtain, on denoting the angular quantum number by $j$ (instead
of $l$ as before) and the axial one by $m$,

$$(M^2)_{jm;\,jm} = \frac{h^2}{4\pi^2}\,j(j+1), \tag{318a}$$

$$(M_z)_{jm;\,jm} = \frac{h}{2\pi}\,m \qquad (-j \le m \le j),$$

$$(M_x + iM_y)_{j,m+1;\,jm} = \frac{h}{2\pi}\sqrt{(j+\tfrac12)^2 - (m+\tfrac12)^2}\;e^{i\alpha_m}, \tag{318b}$$

$$(M_x - iM_y)_{jm;\,j,m+1} = \frac{h}{2\pi}\sqrt{(j+\tfrac12)^2 - (m+\tfrac12)^2}\;e^{-i\alpha_m}$$


[cf. e.g. (96) and (96a)]. As has been pointed out in § 13, the number 
j can assume, from the matrix-theory point of view, both integral and 
half-integral values (the values of m being of the same nature); half- 
integral values occur, however, only if the spin of the particles is 
included, and if M refers to total not orbital angular momentum. 
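
The matrix elements quoted above can be assembled into explicit matrices and checked against the commutation relation for $\mathbf{M}$. The sketch below works in units of $h/2\pi$ and sets the arbitrary phases to zero; the function names and the ordering of rows and columns ($m = j, j-1, \ldots, -j$) are assumptions made for this illustration.

```python
import numpy as np

def angular_momentum_matrices(j_value):
    """Matrices of M_x, M_y, M_z in units of h/(2*pi) for a given j.

    Rows and columns are ordered m = j, j-1, ..., -j; the arbitrary
    phases of the off-diagonal elements are set to zero (an assumption).
    """
    dim = int(round(2 * j_value)) + 1
    m = j_value - np.arange(dim)
    Mz = np.diag(m).astype(complex)
    raising = np.zeros((dim, dim), dtype=complex)    # M_x + i M_y
    for col in range(1, dim):
        # element (m+1, m): sqrt{(j + 1/2)^2 - (m + 1/2)^2}
        raising[col - 1, col] = np.sqrt((j_value + 0.5) ** 2 - (m[col] + 0.5) ** 2)
    lowering = raising.conj().T                       # M_x - i M_y
    Mx = (raising + lowering) / 2
    My = (raising - lowering) / (2 * 1j)
    return Mx, My, Mz

def check(j_value):
    """True if (M x M)_z = i M_z and M^2 = j(j+1) in units of h/(2*pi)."""
    Mx, My, Mz = angular_momentum_matrices(j_value)
    dim = Mz.shape[0]
    cross_z = Mx @ My - My @ Mx
    M2 = Mx @ Mx + My @ My + Mz @ Mz
    ok_cross = np.allclose(cross_z, 1j * Mz)
    ok_square = np.allclose(M2, j_value * (j_value + 1) * np.eye(dim))
    return bool(ok_cross and ok_square)

results = {j_value: check(j_value) for j_value in (0.5, 1.0, 1.5, 2.0)}
```

The same construction serves for integral and half-integral $j$ alike, which illustrates the remark that the matrix method does not by itself exclude half-integral values.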


38. Magnetic Forces and Spin Effects 

A generalization and refinement of the preceding theory along the same
lines as for a single particle (i.e. the establishment of a wave equation
$(H + p_t)\psi = 0$ which would describe the behaviour of a system of
particles in agreement with the relativity theory, taking account of
magnetic forces and of the spin effect) is a problem which admits only
of a partial and approximate solution. This circumstance is not characteristic
of the wave mechanics, for we meet with a similar situation in
the classical mechanics. The latter can be formulated in a relativistically
invariant form for the case of a single particle moving in an external
electromagnetic field, that is, in a field which is supposed to be known
a priori and specified by the potentials $\phi$ and $\mathbf{A}$. The more general
problem of the motion of two or more particles, acting on each other
according to the laws of the classical electromagnetic theory, cannot be
solved with the help of a single equation involving the coordinates of
all the particles for the same instant of time, for according to this
theory the action emanating from each particle travels through space
with a finite velocity ($c$). The force acting on a particle (1) at the instant
$t$ depends upon the position and motion of the other particles (2, 3,...)
at previous instants $t_{12} = t - R_{12}/c$, etc., $R_{12}$ being the distance between
the point where (1) is at the time $t$ and the point where (2) was at the
time $t_{12}$.
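
The retarded instant is defined only implicitly, since $R_{12}$ itself depends on $t_{12}$. A minimal sketch of how it can be found by successive approximation is given below; the trajectory `position_2`, the units in which $c = 1$, and the function names are hypothetical and serve only as an illustration.

```python
import math

C = 1.0  # velocity of propagation in arbitrary units (assumption)

def position_2(t):
    """Hypothetical trajectory of particle (2): uniform motion along x at 0.5 c."""
    return (0.5 * C * t, 0.0, 0.0)

def retarded_time(t, r1, trajectory, c=C, tol=1e-12, max_iter=200):
    """Solve t_ret = t - |r1 - r2(t_ret)| / c by fixed-point iteration.

    The iteration contracts as long as particle (2) moves slower than c.
    """
    t_ret = t
    for _ in range(max_iter):
        distance = math.dist(r1, trajectory(t_ret))
        t_new = t - distance / c
        if abs(t_new - t_ret) < tol:
            return t_new
        t_ret = t_new
    raise RuntimeError("retarded time did not converge")

t = 10.0
r1 = (0.0, 3.0, 0.0)                      # position of particle (1) at time t
t12 = retarded_time(t, r1, position_2)
R12 = math.dist(r1, position_2(t12))
```

The force on (1) at the time `t` is then to be computed from the state of (2) at the earlier time `t12`, which is why no single equation for simultaneous coordinates can describe the motion.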

This fact, usually denoted as the law of retarded action, alone pre- 
cludes the possibility of treating the problem of motion and interaction 
of a number of particles by means of a single equation of the Hamilton- 
Jacobi type. We must, instead, write the relativistic equation of motion 
for each individual particle assuming the electromagnetic potentials
produced by the other particles to be known, and furthermore a set of
equations defining the potentials produced by each particle, its motion 
being supposedly known. 

This problem allows, however, only an exact formulation. It cannot 
be solved exactly even for the simplest case of two particles. And 
there is no doubt that such a solution, if it could be obtained, would 
be in contradiction to the experimental facts. Assuming that the latter 
can be described adequately, so far as the motion of a particle in a 
given external field is concerned, by means of the relativistic wave 
mechanics, we must find a method of describing adequately the electro- 
magnetic field produced by a particle, whose motion is specified in 
terms of wave mechanics, i.e. in terms of the probability theory. This 
means that together with the classical mechanics we must abandon the 
classical electrodynamics, based upon the idea of exactly specified motion, 
and replace it by a new ‘quantum electrodynamics’, not involving 
this idea. 

We shall consider this problem more closely later on (Chapter IX) 
and shall confine ourselves here to the more modest task of incor-
porating into the wave-mechanical theory the magnetic forces, and 
other effects connected with them, neglecting those which are due to the 
retarded character of the interaction between the electrified particles— 
electrons and protons—constituting matter. 

So far as the action of an external magnetic—or electromagnetic— 
field on a system of such particles is concerned, the required generaliza- 
tion of the previous theory presents no difficulties. We have merely 
to replace in the expression of the energy operator the momentum 
operators of the single particles $\mathbf{p}_k$ by the differences

$$\mathbf{p}_k - \frac{e_k}{c}\,\mathbf{A}_k,$$

where $\mathbf{A}_k = \mathbf{A}(x_k, y_k, z_k, t)$ is the vector potential of the external field
at the point where the particle in question is supposed to be situated at
the instant $t$ under consideration.

Putting further $U = \sum_k e_k\phi_k + U'$, where $\phi_k = \phi(x_k, y_k, z_k, t)$ is the
scalar potential of the external field at the point $(x_k, y_k, z_k)$, and
$U' = \sum_{i<k} e_i e_k/r_{ik}$ the mutual potential energy of the particles, we get

$$H = \sum_{k=1}^{n}\Bigl[\frac{1}{2m_k}\Bigl(\mathbf{p}_k - \frac{e_k}{c}\,\mathbf{A}_k\Bigr)^2 + e_k\phi_k\Bigr] + \sum_{i<k}\frac{e_i e_k}{r_{ik}}. \tag{319}$$


In the case (usually met with in practice) where the square of $\mathbf{A}$ can
be neglected as well as $\operatorname{div}\mathbf{A}$, this expression reduces to the form

$$H = \sum_k\Bigl[\frac{1}{2m_k}\,p_k^2 - \frac{e_k}{m_k c}\,\mathbf{A}_k\cdot\mathbf{p}_k + e_k\phi_k\Bigr] + \sum_{i<k}\frac{e_i e_k}{r_{ik}}. \tag{319a}$$
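
The reduction to the simplified form can be made explicit for a single particle. The following lines are a sketch of the intermediate steps, assuming only $\mathbf{p} = (h/2\pi i)\nabla$ and the two approximations named in the text (the particle index is dropped for brevity):

```latex
\begin{aligned}
\Bigl(\mathbf{p}-\tfrac{e}{c}\mathbf{A}\Bigr)^{2}\psi
  &= p^{2}\psi-\tfrac{e}{c}\bigl(\mathbf{p}\cdot\mathbf{A}+\mathbf{A}\cdot\mathbf{p}\bigr)\psi
     +\tfrac{e^{2}}{c^{2}}A^{2}\psi,\\
\mathbf{p}\cdot(\mathbf{A}\psi)
  &= \mathbf{A}\cdot\mathbf{p}\,\psi+\tfrac{h}{2\pi i}\,(\operatorname{div}\mathbf{A})\,\psi,\\
\tfrac{1}{2m}\Bigl(\mathbf{p}-\tfrac{e}{c}\mathbf{A}\Bigr)^{2}
  &\approx \tfrac{1}{2m}\,p^{2}-\tfrac{e}{mc}\,\mathbf{A}\cdot\mathbf{p}
  \qquad(\operatorname{div}\mathbf{A}=0,\ A^{2}\ \text{neglected}).
\end{aligned}
```

The factor 2 arising from the two cross terms cancels the 2 in $1/2m$, which accounts for the coefficient $e/mc$ of the term linear in $\mathbf{A}$.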
We meet a much more difficult problem when we try to incorporate
in the energy operator terms representing the non-statical interaction
of the particles with each other. This problem can be solved approximately
if we neglect the retarded character of the electromagnetic
actions and define accordingly the vector potential produced by a particle
with a charge $e_i$ and velocity $\mathbf{v}_i$ at a distance $r_{ik}$ by the expression
$e_i\mathbf{v}_i/(c\,r_{ik})$.

The total value of the vector potential $\mathbf{A}_k$ at a given point (k) is then
equal to the sum of the part $\mathbf{A}_k^0$ due to the external field and that
$\mathbf{A}'_k = \sum_{i\neq k} e_i\mathbf{v}_i/(c\,r_{ik})$ due to all the other particles. The total momentum of
the kth particle, $\mathbf{p}_k = m_k\mathbf{v}_k + (e_k/c)\mathbf{A}_k$, is thus given by the expression

$$\mathbf{p}_k = \sum_i g_{ki}\,\mathbf{v}_i + (e_k/c)\,\mathbf{A}_k^0, \tag{320}$$

where $g_{ki} = m_k$ if $i = k$ and $e_i e_k/(c^2 r_{ik})$ if $i \neq k$.

The corresponding expression for the total kinetic energy $T$ of the
whole system [equal to the sum of the ordinary kinetic energy $\sum_k \tfrac12 m_k v_k^2$
and of the mutual kinetic energy $T' = \tfrac12\sum_k (e_k/c)\,\mathbf{v}_k\cdot\mathbf{A}'_k$] is

$$T = \tfrac12\sum_i\sum_k g_{ik}\,\mathbf{v}_i\cdot\mathbf{v}_k. \tag{320a}$$

Putting $\mathbf{p}_k - (e_k/c)\mathbf{A}_k^0 = \mathbf{p}'_k$ and solving the equations $\mathbf{p}'_k = \sum_i g_{ki}\,\mathbf{v}_i$
with respect to the $\mathbf{v}_i$'s, we get $\mathbf{v}_i = \sum_k g^{ik}\,\mathbf{p}'_k$, where $g^{ik} = g^{-1}(\partial g/\partial g_{ik})$,
$g$ being the determinant $|g_{ik}|$, and

$$T = \tfrac12\sum_i\sum_k g^{ik}\,\mathbf{p}'_i\cdot\mathbf{p}'_k. \tag{320b}$$

The classical Hamiltonian $H$ is equal to the sum of this expression and
the potential energy $U = \sum_k e_k\phi_k + U'$. The simplest way to obtain
the corresponding quantum Hamiltonian consists in replacing the $\mathbf{p}'$'s
in (320b) by the operators $(h/2\pi i)\nabla - (e/c)\mathbf{A}^0$. Since, however, these
operators do not commute with the coefficients $g^{ik}$ we might just as
well write $\mathbf{p}'_i\,g^{ik}\,\mathbf{p}'_k$ instead of $g^{ik}\,\mathbf{p}'_i\,\mathbf{p}'_k$ or, more generally, $f^{-1}\mathbf{p}'_i\,g^{ik}f\,\mathbf{p}'_k$,
where $f$ is any function of the coordinates. If (following L. Landau) we
put $f = g$ we obtain for the quantum $T$ an operator which can be
considered as a generalization of the ordinary Laplacian in a curved
space with the line-element $ds^2 = \sum_i\sum_k g_{ik}\,dq_i\,dq_k$.
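
The passage from velocities to momenta is a matter of inverting the matrix of the coefficients $g_{ik}$. The sketch below builds this matrix for three particles and verifies that the two quadratic forms for the kinetic energy agree; the masses, charges, positions, velocities, and the value of `c` are illustrative assumptions in arbitrary units.

```python
import numpy as np

def g_matrix(masses, charges, positions, c):
    """Coefficients g_ki: m_k on the diagonal, e_i e_k / (c^2 r_ik) off it."""
    n = len(masses)
    g = np.diag(np.asarray(masses, dtype=float))
    for i in range(n):
        for k in range(n):
            if i != k:
                r_ik = np.linalg.norm(positions[i] - positions[k])
                g[i, k] = charges[i] * charges[k] / (c ** 2 * r_ik)
    return g

# Illustrative values in arbitrary units (assumptions, not from the text).
masses = [1.0, 1.0, 1836.0]
charges = [-1.0, -1.0, 2.0]
c = 137.0
positions = np.array([[1.0, 0.0, 0.0], [0.0, 1.5, 0.0], [0.0, 0.0, 0.0]])
velocities = np.array([[0.0, 1.0, 0.0], [-0.8, 0.0, 0.2], [0.0, 0.0, 0.0]])

g = g_matrix(masses, charges, positions, c)
p_prime = g @ velocities                     # p'_k = sum_i g_ki v_i
T_from_v = 0.5 * np.einsum("ik,ia,ka->", g, velocities, velocities)

g_inv = np.linalg.inv(g)                     # the coefficients g^{ik}
v_back = g_inv @ p_prime                     # v_i = sum_k g^{ik} p'_k
T_from_p = 0.5 * np.einsum("ik,ia,ka->", g_inv, p_prime, p_prime)

# g^{ik} = g^{-1} dg/dg_ik: the cofactor of g_ik divided by the determinant
minor = np.delete(np.delete(g, 0, axis=0), 1, axis=1)
g01_from_cofactor = (-1) ** (0 + 1) * np.linalg.det(minor) / np.linalg.det(g)
```

Since the off-diagonal coefficients depend on the distances $r_{ik}$, the inverse matrix depends on the coordinates as well, which is the origin of the ordering ambiguity in the quantum operator.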


We shall now discuss some further complications of the theory of 
a system of electrons, namely, those connected with the spin effect. 

In the case of a single electron or proton this effect can be accounted
for approximately by introducing, in addition to the three space coordinates
of the particle x, y, z, a fourth ‘spin coordinate’ $\zeta$, able to
assume two values only. These values correspond, as we know, to two
opposite orientations of the spin parallel to a fixed direction, that of
the z-axis say, or, more exactly, to the two characteristic values of the
z-component of the spin matrix $\sigma_z$. We thus get, instead of a single
wave function $\psi(x, y, z)$ describing the motion of the particle in question,
a function doublet $\psi(x, y, z, \zeta)$ which can be dealt with as a linear
two-dimensional matrix with the elements $\psi_1(x, y, z) = \psi(x, y, z, 1)$ and
$\psi_2(x, y, z) = \psi(x, y, z, 2)$, 1 and 2 being the two values of $\zeta$. Instead of
these two values it is often more convenient to use $-\tfrac12$ and $+\tfrac12$, which
are equal to the respective values of the z-component of the spin angular
momentum expressed in the standard $h/2\pi$ unit.

The energy operator, as well as all the other operators referring to
the particle, must be defined accordingly as a square two-dimensional
matrix involving either the spin matrix $\boldsymbol{\sigma}$ or the unit matrix which is
equivalent to the square of any component of $\boldsymbol{\sigma}$.

These results can easily be generalized for a system of elementary par- 
ticles (electrons or protons) so long as their mutual action is neglected. 
The wave function describing the behaviour of the whole system can 
be defined as the product of the functions $\psi_k = \psi_k(x_k, y_k, z_k, \zeta_k)$ referring
to the individual particles ($k = 1, 2, \ldots, n$). The expression $\psi\psi^*$ multiplied
by the volume-element of the configuration space $dV = dV_1 \ldots dV_n$
$= dx_1\,dy_1\,dz_1 \ldots dx_n\,dy_n\,dz_n$ is to be regarded as a measure of the probability
of finding the system in the corresponding configuration with
the specified values of the spin coordinates. The number of such
specified values is obviously equal to $2^n$, so that there are $2^n$ states
corresponding to each configuration and differing from each other by
the orientation of the separate particles inasmuch as this orientation is
specified by the characteristic values of $\sigma_z$. The total probability of a
given configuration, irrespective of the orientation of the particles, is
measured by the sum

$$\sum_{\zeta_1}\sum_{\zeta_2}\cdots\sum_{\zeta_n}\psi\psi^*$$

extended over the two possible values of each of the spin coordinates.
In the case of a motion belonging to a discrete spectrum this sum must
be normalized according to the condition

$$\int\sum_{\zeta}\psi\psi^*\,dV = 1,$$

the integration being extended over the whole configuration space.
With regard to the definition of $\psi$, we have

$$\int\sum_{\zeta}\psi\psi^*\,dV = \int\sum_{\zeta_1}\psi_{1\zeta_1}\psi^*_{1\zeta_1}\,dV_1\;\cdots\int\sum_{\zeta_n}\psi_{n\zeta_n}\psi^*_{n\zeta_n}\,dV_n = 1,$$

where

$$\psi_{k\zeta_k} = \psi_{k\zeta_k}(x_k, y_k, z_k) = \psi_k(x_k, y_k, z_k, \zeta_k) \qquad (\zeta_k = 1, 2).$$

The product $\psi$ considered as a function of the space coordinates alone
can be dealt with as a linear matrix of $2^n$ dimensions

$$\psi_{\zeta_1\zeta_2\ldots\zeta_n} = \psi_{1\zeta_1}\,\psi_{2\zeta_2}\cdots\psi_{n\zeta_n}.$$
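
At a fixed configuration the product function is simply an array of $2^n$ numbers, one for each choice of the spin coordinates. A minimal sketch for three particles follows; the numerical amplitudes are arbitrary assumptions, and the Kronecker product is used here as one convenient way of ordering the components.

```python
import numpy as np
from functools import reduce

# Two-component amplitudes (psi_k1, psi_k2) of three particles at one fixed
# configuration; the numbers are illustrative assumptions.
spinors = [
    np.array([1.0, 1.0j]) / np.sqrt(2.0),
    np.array([0.6, 0.8], dtype=complex),
    np.array([1.0, 0.0], dtype=complex),
]

n = len(spinors)
psi = reduce(np.kron, spinors)          # array with 2**n components
psi_tensor = psi.reshape((2,) * n)      # psi_tensor[z1, z2, z3], zeta_k = index + 1

# Sum of psi psi* over all spin coordinates factorizes into single-particle sums
total = np.sum(np.abs(psi) ** 2)
product_of_sums = np.prod([np.sum(np.abs(s) ** 2) for s in spinors])
```

The factorization of `total` into `product_of_sums` is the discrete counterpart of the factorized normalization integral.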


This involves the use of operators which should be defined as square
matrices of the same rank. Such an operator, $F$ say, can be defined
by the equation

$$(F\psi)_{\zeta'} = \sum_{\zeta''} F_{\zeta'\zeta''}\,\psi_{\zeta''},$$

where $F_{\zeta'\zeta''}$ is an operator of the ordinary kind with respect to the
space coordinates $x_1, \ldots, z_n$ and the corresponding momenta, specified by
two sets of particular values of the spin coordinates, $\zeta' = (\zeta'_1, \zeta'_2, \ldots, \zeta'_n)$
and $\zeta'' = (\zeta''_1, \zeta''_2, \ldots, \zeta''_n)$. Each of the individual wave functions $\psi_k$
satisfies the matrix-operator equation

$$(H_k + \delta_k\,p_t)\,\psi_k = 0,$$

where $\delta_k$ is the two-dimensional unit matrix referring to the kth particle
(with the elements $\delta_{\zeta'_k\zeta''_k}$). The factorized wave function is easily seen
to satisfy an equation of the same type,

$$(H + \delta\,p_t)\,\psi = 0, \tag{321}$$

where $\delta$ is the $2^n$-dimensional unit matrix with the elements

$$\delta_{\zeta'\zeta''} = \delta_{\zeta'_1\zeta''_1}\,\delta_{\zeta'_2\zeta''_2}\cdots\delta_{\zeta'_n\zeta''_n}, \tag{321a}$$

and $H$ the energy operator defined by the formula

$$H_{\zeta'\zeta''} = H_{1\zeta'_1\zeta''_1}\,\delta_{\zeta'_2\zeta''_2}\cdots\delta_{\zeta'_n\zeta''_n} + \delta_{\zeta'_1\zeta''_1}\,H_{2\zeta'_2\zeta''_2}\cdots\delta_{\zeta'_n\zeta''_n} + \cdots + \delta_{\zeta'_1\zeta''_1}\,\delta_{\zeta'_2\zeta''_2}\cdots H_{n\zeta'_n\zeta''_n}, \tag{321b}$$

$H_{k\zeta'_k\zeta''_k}$ being the elements of the ordinary two-dimensional matrix
operator referring to the kth particle.
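
For non-interacting particles the energy matrix is a sum of terms, each acting on one particle and as the unit matrix on all the others. A sketch for two particles, keeping only the spin indices, is given below; the single-particle matrices `H1` and `H2` are hypothetical, and the space dependence of the operators is ignored for the sake of the illustration.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
unit = np.eye(2, dtype=complex)

# Hypothetical single-particle energy matrices (2 x 2, Hermitian).
H1 = 0.7 * sigma_z + 0.2 * sigma_x
H2 = -0.4 * sigma_z

# Sum of the single-particle operators, each multiplied by the unit
# matrix of the other particle: H = H1 (x) 1 + 1 (x) H2
H = np.kron(H1, unit) + np.kron(unit, H2)

E1, V1 = np.linalg.eigh(H1)
E2, V2 = np.linalg.eigh(H2)

# A product of single-particle solutions is a solution for the whole system,
# with an energy equal to the sum of the single-particle energies.
product_state = np.kron(V1[:, 0], V2[:, 0])
residual = np.linalg.norm(H @ product_state - (E1[0] + E2[0]) * product_state)

all_sums = np.sort([a + b for a in E1 for b in E2])
levels = np.linalg.eigvalsh(H)
```

The four characteristic values in `levels` coincide with the four sums in `all_sums`, so long as no interaction terms are added.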
Equation (321) can naturally be extended to functions $\psi$ of a more
general type, equal, for instance, to a sum of particular solutions of the
simple product type. It can be further generalized in order to account
for the mutual action of the particles by adding to it terms representing
the interaction energy multiplied by the unit matrix (321a). (In problems
of the atomic theory involving only a small number of electrons
the mutual kinetic energy $T'$ can be neglected.) There remains, however,
still one step in this generalization, which consists in the addition
to the interaction energy of terms characteristic of the spin effect. We
can solve this problem in a tentative way with the help of the approximate
theory of § 30. We found there that the additional ‘spin’ force
acting on a particle (electron) in a given electromagnetic field $\mathbf{E}$, $\mathbf{H}$ can
be derived from the energy operator

$$-\mu\Bigl[\boldsymbol{\sigma}\cdot\mathbf{H} + \frac{1}{2m_0c}\,\mathbf{E}\cdot(\mathbf{p}\times\boldsymbol{\sigma})\Bigr]$$


[cf. equation (261 a), where u is replaced by p]. It is natural to suppose 
that this result will still be valid for a system of particles, if H and E 
are defined as the total field acting on the given particle due both to 
external causes and to other particles constituting the system. The 
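field $\mathbf{E}$, $\mathbf{H}$ produced by a certain particle at a distance $r$ can be derived
in the usual way from the potentials $\phi$, $\mathbf{A}$ defined by the following
formulae:

$$\phi = \frac{e}{r}, \qquad \mathbf{A} = \frac{e}{m_0c\,r}\,\mathbf{p} + \frac{\mu}{r^3}\,\boldsymbol{\sigma}\times\mathbf{r}.$$

The first term in $\mathbf{A}$ represents the ordinary electromagnetic field of
a moving point-charge, while the second is introduced as an equivalent
for the field produced by an elementary magnet with a moment $\mu\boldsymbol{\sigma}$.
Neglecting the electric field due to the variation of $\mathbf{A}$ with the time,
i.e. putting

$$\mathbf{E} = -\nabla\phi = \frac{e\,\mathbf{r}}{r^3}$$

and

$$\mathbf{H} = \operatorname{curl}\mathbf{A} = -\frac{e}{m_0c\,r^3}\,\mathbf{r}\times\mathbf{p} + \mu\Bigl[\frac{3\mathbf{r}(\boldsymbol{\sigma}\cdot\mathbf{r})}{r^5} - \frac{\boldsymbol{\sigma}}{r^3}\Bigr],$$

we get for the operator of the spin interaction energy the following
expression:

$$U_s = U_l + U_q, \tag{322}$$

where

$$U_l = \sum_{i\neq k}\frac{e_i\mu_k}{c\,r_{ki}^3}\,\boldsymbol{\sigma}_k\cdot\Bigl[\mathbf{r}_{ki}\times\Bigl(\frac{\mathbf{p}_i}{m_i} - \frac{\mathbf{p}_k}{2m_k}\Bigr)\Bigr] \tag{322a}$$

and

$$U_q = -\sum_{k<i}\mu_i\mu_k\Bigl[\frac{3}{r_{ki}^5}\,(\boldsymbol{\sigma}_i\cdot\mathbf{r}_{ki})(\boldsymbol{\sigma}_k\cdot\mathbf{r}_{ki}) - \frac{1}{r_{ki}^3}\,\boldsymbol{\sigma}_i\cdot\boldsymbol{\sigma}_k\Bigr]. \tag{322b}$$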

In deriving the term $U_l$ which represents the linear or electromagnetic
part of the spin interaction energy we have simply summed up the
contributions of all the particles concerned ($\mathbf{r}_{ki}$ denoting the radius
vector from the ith particle to the kth), whereas in deriving the
quadratic or purely magnetic part of the energy $U_q$, which is symmetrical
with regard to each pair of particles, we have taken each pair
only once (as indicated by the condition $k < i$).
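
The purely magnetic part of the spin interaction has, for one pair of particles with spin $\tfrac12$, the form of the energy of two elementary magnets, and can be written out as a four-dimensional matrix. The sketch below does this with the Pauli matrices; the unit magnetic moments, the unit distance, and the function name are assumptions made for the illustration.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI = (sx, sy, sz)
ONE = np.eye(2, dtype=complex)

def spin_spin_energy(r_vec, mu_i=1.0, mu_k=1.0):
    """Matrix of -mu_i mu_k [3 (s_i.r)(s_k.r)/r^5 - s_i.s_k/r^3] for one pair."""
    r_vec = np.asarray(r_vec, dtype=float)
    r = np.linalg.norm(r_vec)
    s_i = [np.kron(s, ONE) for s in PAULI]   # sigma of the first particle
    s_k = [np.kron(ONE, s) for s in PAULI]   # sigma of the second particle
    dot_ik = sum(a @ b for a, b in zip(s_i, s_k))
    si_r = sum(x * a for x, a in zip(r_vec, s_i))
    sk_r = sum(x * b for x, b in zip(r_vec, s_k))
    return -mu_i * mu_k * (3.0 * si_r @ sk_r / r ** 5 - dot_ik / r ** 3)

U = spin_spin_energy([0.0, 0.0, 1.0])   # particles separated along the z-axis
levels = np.linalg.eigvalsh(U)
```

For this geometry the four levels are $-2, -2, 0, 4$ in units of $\mu_i\mu_k/r^3$; their sum vanishes, so that the magnetic interaction splits a level without shifting its centre of gravity.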

It should be noticed further that in adding $U_s$ to the Hamiltonian
$H$ in (321) we must multiply it by the unit matrix (321 a). This amounts
to the multiplication of each term by those two-dimensional unit
matrices only which refer to other particles than those represented by
the matrices $\boldsymbol{\sigma}$. Dropping these unit matrices and neglecting the
mutual kinetic energy we can write the total Hamiltonian in the form

$$H = \sum_k H_k + U' + U_s, \tag{323}$$

where $U_s$ is defined by (322), while

$$H_k = \frac{1}{2m_k}\,p_k^2 - \frac{1}{8m_k^3c^2}\,p_k^4 - \frac{e_k}{c\,m_k}\,\mathbf{A}_k\cdot\mathbf{p}_k + e_k\phi_k - \mu_k\Bigl[\mathbf{H}_k\cdot\boldsymbol{\sigma}_k + \frac{1}{2m_kc}\,\mathbf{E}_k\cdot(\mathbf{p}_k\times\boldsymbol{\sigma}_k)\Bigr] \tag{323a}$$

is the energy operator for the kth particle, $\mathbf{A}_k$, $\phi_k$, $\mathbf{H}_k$, and $\mathbf{E}_k$ denoting
the potentials and intensities of the external electromagnetic field at
the point $(x_k, y_k, z_k)$. If this field does not depend upon the time, the
equation $(H + p_t)\psi = 0$ admits solutions of the type $\psi = \psi^0_{H'}\,e^{-i2\pi H't/h}$
corresponding to a motion of the system with a fixed energy $H'$, the
function $\psi^0_{H'}$ being defined by $(H - H')\psi^0_{H'} = 0$. To each state or energy-level
defined by the approximate equation to which it reduces if the
spin effect of all the particles is neglected, there correspond, in general,
$2^n$ different states with slightly different energy-levels, which form what
is called a ‘spin multiplet’. The theory of such multiplets for the
simplest case of a single particle has been discussed in the preceding
chapter. The general results stated there (§ 29) about the orthogonality
properties of the functions $\psi^0_{H'}$, the matrix and supermatrix representation
of various physical quantities, the perturbation theory, and so on
can easily be extended to the case $n > 1$. We shall not discuss these
questions here, but shall leave some of them for a later section where 
they will be considered in connexion with Pauli’s exclusion principle for 
identical elementary particles (electrons or protons). 

The method which has been applied above for the description of the 
spin effect characteristic of such particles can be used in a somewhat 
generalized form for the description of the orientation or inner states
of complicated particles (such as atomic nuclei or whole atoms, etc.)
so long as they are treated as moving material points. 

Let us consider, for example, a particle possessing an inner angular 
momentum (which may be due both to orbital and spin motion of the 
electrons and protons constituting it) of $s$ units. Such a particle can
assume $2s+1$ quantized orientations, corresponding to the values

$$m = -s,\ -(s-1),\ -(s-2),\ \ldots,\ +(s-1),\ +s \tag{324}$$

of the z-component of $s$. These numbers can be defined as the characteristic
values of a matrix $\sigma_z$, which is the z-component of a matrix $\boldsymbol{\sigma}$
representing the inner angular momentum of the particle in question
(in units of $h/2\pi$). The matrix elements of $\sigma_x$, $\sigma_y$, and $\sigma_z$ are defined
by the equations

$$\begin{aligned}
(\sigma_x + i\sigma_y)_{m+1,\,m} &= \sqrt{(s+\tfrac12)^2 - (m+\tfrac12)^2}\;e^{i\alpha_m},\\
(\sigma_x - i\sigma_y)_{m,\,m+1} &= \sqrt{(s+\tfrac12)^2 - (m+\tfrac12)^2}\;e^{-i\alpha_m},\\
(\sigma_z)_{m,\,m} &= m,
\end{aligned} \tag{324a}$$

which are obtained from the equations (94b), (96), and (96a) of § 13
(Chap. III), if $\mathbf{M}$ is replaced by $h\boldsymbol{\sigma}/2\pi$ and $l$ by $s$. The motion of such
a particle in a given external field of force can be described in exactly
the same way as this has been done above for the particular case $s = \tfrac12$,
namely, by introducing in addition to the ‘external’ coordinates x, y, z,
defining the position of the particle’s centre of gravity, an ‘inner’
angular momentum coordinate $\zeta$, which should assume the values
$1, 2, \ldots, 2s+1$, corresponding to the characteristic values (324) of $\sigma_z$. If,
moreover, the additional energy of the particle in a magnetic field $\mathbf{H}$ is
represented by the operator $\mu\,\mathbf{H}\cdot\boldsymbol{\sigma}$, we get a direct generalization of the
Pauli theory of the spin effect, discussed in § 29. A similar generalization
is obtained if we consider a system of particles, such as electrons
and atomic nuclei, which differ from each other not only with respect
to the charge and mass, but also in respect to the inner momentum
number $s$ or the multiplicity $2s+1$. A problem of this sort is met with,
for instance, in connexion with the hyperfine structure of atomic
spectra, due to the fact that the nuclei of many atoms actually possess
an inner angular momentum and a very small magnetic moment associated
with it. The magnetic field produced by the latter can be
specified by a vector potential of the same form,

$$\mathbf{A} = \frac{\mu}{r^3}\,\boldsymbol{\sigma}\times\mathbf{r},$$

as for an electron (or proton), giving rise to an interaction energy of
the type (322a, b), with $\boldsymbol{\sigma}_k$ denoting matrices of various ranks (2 for
an electron; 1, 2, 3, etc., for a nucleus).

These considerations show, by the way, that an electron can be 
visualized not as a point but as a spinning sphere, according to the 
classical model, in spite of the fact that in the Pauli or Dirac theory 
it is treated as a point. 


39. Complex Particles treated as Material Points with Inner 
Coordinates; Theory of Incomplete Systems 


Complex particles can be treated as elementary, i.e. as material points, if
inner coordinates and momenta are introduced to specify their orientation,
the total value of the inner angular momentum, if it is variable,
as well as other quantities, serving to describe their inner properties.
Let us denote by $x$ $(x, y, z)$ the coordinates of the centre of gravity
of the particle, the coordinates specifying the relative motion of the
elementary particles (electrons, protons) constituting it being denoted
by $q$ $(q_1, q_2, \ldots)$. Let us divide further the energy operator $H$ into three
parts, $K$, $L$, $M$, where $K$ is a function of the $x$’s (and of the associated
momenta represented by the operators $\frac{h}{2\pi i}\frac{\partial}{\partial x}$), $L$ a function of the $q$’s
(and of the associated momenta $\frac{h}{2\pi i}\frac{\partial}{\partial q}$, as well as of the spin variables),
while $M$ is a function of both. We shall assume them all to be independent
of the time and shall denote the characteristic values and
functions of $L$ by $L'$ and $\chi_{L'}(q)$ respectively.

The solution of the equation $(H - H')\psi_{H'} = 0$ for a stationary state
of the complex particle (supposed to move in a given external field of
force) can be represented in the form

$$\psi_{H'} = \sum_{L'}\phi_{L'}(x)\,\chi_{L'}(q), \tag{325}$$

where $\phi_{L'}(x)$ are certain expansion coefficients with regard to the
variables $q$, being themselves functions of the variables $x$. These
functions can be determined by substituting (325) in the equation
$(H - H')\psi_{H'} = 0$, which gives

$$\sum_{L'}\bigl[(K\phi_{L'})\,\chi_{L'} + (L' - H')\,\phi_{L'}\chi_{L'} + M\chi_{L'}\phi_{L'}\bigr] = 0.$$

Now the operator $M$ applied to the function $\chi_{L'}$ and thereafter to the
function $\phi_{L'}$ gives the same result as the operator

$$\sum_{L''}\chi_{L''}\,M_{L''L'}$$

acting directly on $\phi_{L'}$, where

$$M_{L''L'} = \int\chi^*_{L''}\,M\,\chi_{L'}\,dq$$

are the matrix elements of $M$ with regard to the characteristic inner
states of our complex particle (these matrix elements are functions of the
$x$ and, in general, of the associated operators $\frac{h}{2\pi i}\frac{\partial}{\partial x}$). We thus get

$$\sum_{L'}\chi_{L'}\bigl[K\phi_{L'} + (L' - H')\,\phi_{L'}\bigr] + \sum_{L'}\sum_{L''}\chi_{L''}\,M_{L''L'}\,\phi_{L'} = 0,$$

or, interchanging the summation indices in the double sum and equating
to zero the coefficients of the functions $\chi_{L'}$,

$$K\phi_{L'} + \sum_{L''}M_{L'L''}\,\phi_{L''} = (H' - L')\,\phi_{L'}. \tag{325a}$$

The system of equations can be written in the form of a single ‘operator-matrix’
equation

$$J\phi = J'\phi \tag{325b}$$

if $\phi$ is defined as a one-column matrix with the elements $\phi_{L'}$, and $J$ as
a square matrix operator with the elements

$$J_{L'L''} = K\,\delta_{L'L''} + M_{L'L''}, \tag{325c}$$

$\delta_{L'L''}$ denoting the unit matrix and $J\phi$ the one-column matrix resulting
from the matrix multiplication of $J$ by $\phi$; $J' = H' - L'$ are the characteristic
values of $J$. We can also regard $\phi$ as a vector and $J$ as a tensor
in the state-space, corresponding to the inner motion (and orientation)
of the particle under consideration, and specified by the quantum
numbers $L'$ (which must include besides the energy other constants
of the inner motion). We can finally regard $L'$ as a sort of ‘inner’
coordinate (or coordinates) of the particle so long as it is treated as
a material point, in the same sense as this is done in Pauli’s or Dirac’s
theory of the spinning electron, with the only difference that the number
of possible values of $L'$ is in general infinite, instead of being equal to
2 (as in Pauli’s theory) or to 4 (as in that of Dirac). The ‘inner’ quantum
numbers corresponding to these additional coordinates in the functions
$\phi(x, L')$, compared with the functions $\phi_{K'}(x)$ which are the solutions of
the ‘unperturbed’ equation $(K - K')\phi_{K'}(x) = 0$, can be represented by
the values of the difference $J' - K'$ for the same value of $K'$.
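
The coupled equations for the coefficients $\phi_{L'}(x)$ lend themselves to a direct numerical treatment once the translational coordinate is put on a grid. The following sketch treats one coordinate $x$ and two inner states. It is an illustration under several assumptions that are not in the text: units with $h/2\pi = m = 1$, a harmonic external potential, the inner levels `L_inner`, and a real coupling `M_12` linear in $x$. The inner energies are placed on the diagonal, so that the characteristic values obtained are the total energies $H'$.

```python
import numpy as np

def translational_operator(x, potential):
    """Finite-difference matrix of K = -(1/2) d^2/dx^2 + V(x) on a uniform grid."""
    n = len(x)
    dx = x[1] - x[0]
    off = np.ones(n - 1)
    laplacian = (np.diag(off, -1) - 2.0 * np.eye(n) + np.diag(off, 1)) / dx ** 2
    return -0.5 * laplacian + np.diag(potential)

x = np.linspace(-8.0, 8.0, 161)
n = len(x)
K = translational_operator(x, 0.5 * x ** 2)   # hypothetical external field
L_inner = np.array([0.0, 0.3])                # assumed inner levels L'
M_12 = 0.05 * x                                # assumed coupling M_12(x) = M_21(x)

# Block matrix with the elements (K + L') delta_{L'L''} + M_{L'L''}
A = np.zeros((2 * n, 2 * n))
for a in range(2):
    A[a * n:(a + 1) * n, a * n:(a + 1) * n] = K + L_inner[a] * np.eye(n)
A[:n, n:] = np.diag(M_12)
A[n:, :n] = np.diag(M_12)

H_levels = np.linalg.eigvalsh(A)               # total energies H'
uncoupled_ground = np.linalg.eigvalsh(K)[0] + L_inner[0]
```

The block matrix is symmetric, in accordance with the self-adjoint character of the complete Hamiltonian, and the coupling depresses the lowest level slightly below its uncoupled value.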

The different solutions of the equation

$$J\phi_{J'} = J'\phi_{J'},$$

i.e. solutions referring to different values of $J'$, if quadratically integrable,
are orthogonal to each other and can be normalized according
to the equation

$$\int\phi^{\dagger}_{J'}\,\phi_{J''}\,dx = \delta_{J'J''}, \tag{326}$$

where $\phi^{\dagger}_{J'}$ is the one-row matrix formed by the elements which are the
conjugate complex of those constituting the one-column matrix $\phi_{J'}$.
Introducing $L'$ as an inner coordinate, we can rewrite the preceding
equation in the form

$$\int\sum_{L'}\phi^*_{J'}(x, L')\,\phi_{J''}(x, L')\,dx = \delta_{J'J''}. \tag{326a}$$

This result easily follows from the self-adjoint character of the operator
matrix $J$, which in its turn is a consequence of the self-adjoint character
of the complete Hamiltonian $H$.

All quantities referring to the translational motion of the particle
under consideration must be represented by operator-matrices of the
type

$$F\Bigl(x, \frac{h}{2\pi i}\frac{\partial}{\partial x}\Bigr)_{L'L''} = F\Bigl(x, \frac{h}{2\pi i}\frac{\partial}{\partial x};\ L', L''\Bigr),$$

the inner coordinates appearing twice, in the role of ordinary coordinates
and in that of the momenta. The matrix element of such
a quantity with regard to two states of motion, specified by the functions
$\phi_{J'}$ and $\phi_{J''}$, is given by the expression

$$F_{J'J''} = \int\phi^{\dagger}_{J'}\,F\,\phi_{J''}\,dx = \int dx\sum_{L'}\sum_{L''}\phi^*_{J'}(x, L')\,F(L', L'')\,\phi_{J''}(x, L''). \tag{326b}$$

This expression is a generalization of those appearing in the theory of
Pauli and Dirac, with the inner (‘spin’) coordinate assuming two or
four values only.

Let us suppose, for example, that the particle is an ion (charge $e$,
mass $m$) moving in an electrostatic field, which within the particle can
be dealt with as practically homogeneous and equal to $\mathbf{E} = -\nabla V(x, y, z)$,
where $V(x, y, z)$ is the electric potential at the point (centre of gravity)
representing the particle. We then have, by the ordinary Schrödinger
theory,

$$K = -\frac{h^2}{8\pi^2 m}\,\nabla^2 + eV(x, y, z),$$

as for an elementary particle with a charge $e$ and a mass $m$, and further

$$M = -\mathbf{E}(x, y, z)\cdot\mathbf{P}(q),$$

where $\mathbf{P}$ is the resulting electric moment of the particle, the position
of the electrons and protons being referred to the point $(x, y, z)$. The
operator $L$ which specifies the inner motion of the particle (in the
absence of the external electric field) need not be considered here. All
we need to know are the matrix elements of $\mathbf{P}$ with regard to the
stationary states representing this inner motion, the translational
motion being determined by an equation of the type (325a) with

$$M_{L'L''} = -\mathbf{E}(x)\cdot\mathbf{P}_{L'L''}.$$

For a particle moving in an inhomogeneous magnetic field (a problem
met with, for example, in the Stern-Gerlach experiments), we get in
a similar manner

$$K = -\frac{h^2}{8\pi^2 m}\,\nabla^2 \quad\text{and}\quad M_{L'L''} = -\mathbf{H}(x)\cdot\boldsymbol{\mu}_{L'L''},$$

$\boldsymbol{\mu}_{L'L''}$ being the matrix elements of the resulting magnetic moment of
the particle.

The preceding theory can be easily extended to the general case of 
a system of complex particles, considered as material points, or to the 
still more general case of any ‘incomplete’ system A, which is a part 
of a complete system AB, specified by the Hamiltonian H. If the part 
of $H$ corresponding to A taken alone is denoted by $K$, that corresponding
to B by $L$, and the rest, representing the mutual action or
‘coupling’ between A and B, by $M$, we obtain for the motion of A the
same results as before, the coordinates $x$ specifying in the general case
the configuration of A, and $\phi(x, L')$ being the probability amplitude of
this configuration for a given stationary state $L'$ of B.

In the case of two particles, for example, we have, denoting by $x_1$, $x_2$
the coordinates of the respective centres of gravity and by $q_1$, $q_2$ the inner
coordinates,

$$\chi_{L'}(q) = \chi_{L'_1}(q_1)\,\chi_{L'_2}(q_2), \tag{327}$$

since the operator of the inner motion (without interaction) $L$ obviously
reduces to the sum of the corresponding operators $L_1$ and $L_2$ for each
of the two particles taken separately. Putting further

$$\phi_{L'}(x) = \phi_{L'_1L'_2}(x_1, x_2), \tag{327a}$$

we obtain for $\phi$ an equation of the same type as before. If the two
particles are treated with regard to their mutual action as electrical
dipoles, their mutual potential energy will be represented by the
operator

$$M = -\frac{1}{r^3}\Bigl[\frac{3}{r^2}\,(\mathbf{r}\cdot\mathbf{P}_1)(\mathbf{r}\cdot\mathbf{P}_2) - \mathbf{P}_1\cdot\mathbf{P}_2\Bigr],$$

where $\mathbf{r}$ is the radius vector drawn from one particle to the other (with
the components $x_2 - x_1$, etc.), whence

$$M_{L'L''} = -\frac{1}{r^3}\Bigl[\frac{3}{r^2}\,(\mathbf{r}\cdot\mathbf{P}_{1L'_1L''_1})(\mathbf{r}\cdot\mathbf{P}_{2L'_2L''_2}) - \mathbf{P}_{1L'_1L''_1}\cdot\mathbf{P}_{2L'_2L''_2}\Bigr]. \tag{327b}$$
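
The sign and the angular dependence of the dipole coupling are easily checked on the classical expression, of which the matrix elements are obtained by replacing each moment by its matrix elements. The sketch below evaluates this expression for two simple arrangements; the function name and the numerical values are assumptions chosen for illustration.

```python
import numpy as np

def dipole_dipole_energy(p1, p2, r_vec):
    """Mutual energy -(1/r^3) [ (3/r^2)(r.P1)(r.P2) - P1.P2 ] of two dipoles."""
    p1, p2, r_vec = (np.asarray(v, dtype=float) for v in (p1, p2, r_vec))
    r = np.linalg.norm(r_vec)
    return -(3.0 * np.dot(r_vec, p1) * np.dot(r_vec, p2) / r ** 2 - np.dot(p1, p2)) / r ** 3

p = np.array([0.0, 0.0, 1.0])                                 # unit moment along z
head_to_tail = dipole_dipole_energy(p, p, [0.0, 0.0, 2.0])    # moments along r
side_by_side = dipole_dipole_energy(p, p, [2.0, 0.0, 0.0])    # moments perpendicular to r
```

With unit moments at a distance 2 the energy is $-2P^2/r^3 = -0.25$ for moments lying along the line of centres (attraction) and $+P^2/r^3 = 0.125$ for parallel moments perpendicular to it (repulsion).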


It should be noted that in spite of the incompleteness of the system 
A, specified by the energy operator K+ M which represents its own 


390 PROBLEM OF MANY PARTICLES § 39 
energy and the action on it produced by the ‘ignored’ part B, the motion 
of A is exactly determined if the operator M is defined as a matrix 
with regard to the stationary states of B. This method of describing the 
motion of an incomplete system A is especially convenient if its coupling 
with B is relatively weak and if for some reason we are not concerned 
with the details of the motion of B. As a further example of a (rather 
unconscious) application of this method we shall mention Fermi’s theory 
of the hyperfine structure of spectra, due to the mutual action of an 
electron (A) with a nucleus (B) possessing a magnetic moment. The 
motion of the electron is determined in this theory with the help of 
Dirac’s equation, the action of the nuclear magnetic moment on the 


electron being represented by the vector potential A = re xr, where 


o is the well-known matrix of rank 2s+1, specifying the angular 
momentum of the nucleus hs/27. The wave function y% must be treated 
accordingly as a rectangular matrix with four columns (corresponding 
to the four components of the Dirac wave function) and 2s+1 rows.— 
We shall discuss later another interesting application of the same 
method (due to Heisenberg) to the problem of the interaction between 
matter (A) and radiation (B), the latter being described by ordinary 
electromagnetic oscillations, whose amplitudes are treated as matrices 
(Chap. IX).

If the interaction energy $M$ is relatively small so that the second
term on the left side of the equation,

$$K\phi(x, L') + \sum_{L''} M(L', L'')\,\phi(x, L'') = (H' - L')\,\phi(x, L'),$$


can be treated as a small perturbation, this equation can be solved 
approximately with the help of the ordinary perturbation method 
starting with the solution of the equation which is obtained by dropping 
the term $M$. More exactly, since our problem becomes degenerate, we
must consider the whole set of solutions corresponding to the same
unperturbed energy-level $H' - L' = K'$. Writing $(K', \bar{L}')$ for $J'$, where
$\bar{L}'$ denotes an inner quantum number independent of $L'$ but identical
with it in regard to the range of its possible values,† we can define an
orthogonal and normal set of solutions of the unperturbed equation
$K\phi = K'\phi$ by the formula,

$$\phi_{K'\bar{L}'}(x, L') = \psi_{K'}(x)\,\delta_{\bar{L}'L'}, \tag{328}$$

where $\psi_{K'}(x)$ denotes the solution of the above equation leaving out of
account the inner coordinates, while $\delta_{\bar{L}'L'}$ are the elements of the unit
matrix. The function $\psi_{K'}(x)$ is supposed to be normalized according to
the ordinary condition $\int |\psi_{K'}(x)|^2\,dx = 1$; it is supposed, moreover, to
be the only solution of the ordinary Schrödinger equation $K\psi = K'\psi$
corresponding to the energy-level $K'$ (so that no further degeneracy
outside of that which is specified by the quantum numbers $L'$ need be
considered).

† In the same sense as the spin coordinate $\zeta = 1, 2$ and the spin quantum number
$\lambda = 1, 2$ for a single electron of the Pauli theory (cf. § 29).

§ 39 COMPLEX PARTICLES 391

The approximate solutions of the exact equation, ‘stabilized’ for the 
perturbation M, can be defined, according to the general theory, as 
linear combinations of the functions (328)

$$\phi(x, L') = \sum_{\bar{L}'} c_{\bar{L}'}\,\phi_{K'\bar{L}'}(x, L'). \tag{328 a}$$

The sum reduces in the present case to a single term, so that we get

$$\phi(x, L') = c_{L'}\,\psi_{K'}(x). \tag{328 b}$$


If $M$ were an ordinary operator not involving the inner coordinates,
then the coefficients of the transformation (329) for each admissible
value of the perturbation energy $H' - L' - K' = \Delta K'$ (together with
the latter) would be determined by the system of equations

$$\sum_{\bar{L}''} M_{\bar{L}'\bar{L}''}\,c_{\bar{L}''} = \Delta K'\,c_{\bar{L}'},$$

where $M_{\bar{L}'\bar{L}''}$ are the matrix elements of $M$ with respect to the unperturbed
functions. These equations remain valid in the present case
provided the matrix elements of $M$ are defined according to the general
formula (326 b) which gives, in virtue of (328 a),

$$M_{\bar{L}'\bar{L}''} = \int \psi^*_{K'}(x)\,M(\bar{L}', \bar{L}'')\,\psi_{K'}(x)\,dx.$$

Denoting this expression by $M_{K'L',K'L''}$ and dropping the bars over the
$L$'s, we get

$$\sum_{L''} M_{K'L',K'L''}\,c_{L''} = \Delta K'\,c_{L'}. \tag{329}$$


We shall not stop here to discuss these equations, since they are
practically identical with those of the ordinary perturbation theory.
It should be added in conclusion that the preceding theory can easily
be generalized for non-stationary phenomena corresponding to an explicit
dependence of the energy operator $H$ upon the time. So long
as this dependence does not affect the operator $L$, it is sufficient to
replace the characteristic value $H'$ of $H$ in (325 a) by the operator
$-\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$, the functions $\phi_{L'}$ being determined by the equation

$$-\frac{h}{2\pi i}\frac{\partial \phi_{L'}}{\partial t} = (K + L')\,\phi_{L'} + \sum_{L''} M_{L'L''}\,\phi_{L''}. \tag{329 a}$$




40. Identical Particles (Electrons) and the Exclusion Principle 


Returning to elementary particles, we shall now take into account the 
restrictive condition which follows from the identity of all the electrons 
or all the protons and which is expressed by Pauli’s exclusion principle 
or by the Dirac antisymmetry principle for the wave functions $\psi$
describing the behaviour of a system of electrons or protons (see § 22, 
Part I). For the sake of simplicity we shall apply this principle to 
a system of electrons only, treating protons and atomic nuclei as fixed 
centres of force. Such a treatment can actually be applied with suffi- 
cient accuracy to many problems connected with the structure of atoms, 
molecules, and material bodies; for in view of the relatively large mass 
of the atomic nuclei—protons included—they can be dealt with to a 
certain approximation as fixed material points, producing the external
electrostatic (and also magnetostatic) field in which the electrons are 
supposed to move. 

We must, to begin with, check the validity of the Pauli principle in 
Dirac’s form—in the sense of its permanence in time—from the point 
of view of the generalized equation of motion, involving the spin 
coordinates, which has been established in the preceding chapter.†

This equation can be written in the following form:

$$H(x_1, p_1, \zeta_1; \dots; x_n, p_n, \zeta_n)\,\psi(x_1, \zeta_1; \dots; x_n, \zeta_n) = -\frac{h}{2\pi i}\frac{\partial}{\partial t}\psi(x_1, \zeta_1; \dots; x_n, \zeta_n), \tag{330}$$

i.e. as a system of $2^n$ equations for the set of $2^n$ wave functions $\psi_\zeta$,
where $x_k$ and $p_k$ stand short for the coordinate triplets $x_k$, $y_k$, $z_k$ and the
momentum components $p_{kx}$, $p_{ky}$, $p_{kz}$. The space coordinates of each
particle, together with its spin coordinate, form a coordinate quadruplet;
the same is true of the momenta, the momentum corresponding
to the spin coordinate being replaced by a duplication of the latter,
which gives to $H$ its operator-matrix character.

In view of the identity of all the electrons, $H$ must be a symmetrical
function with regard to the indices 1, 2,... distinguishing them. If,
therefore, the wave function $\psi$ is symmetrical or antisymmetrical with
regard to these indices, i.e. with regard to all the coordinate quadruplets,
at some instant $t$, its derivative $\partial\psi/\partial t$, and consequently its
value for the next (or preceding) instant, will be so too. The symmetrical
or antisymmetrical character of $\psi$ can be regarded therefore as
a permanent property. The fact that for a system of electrons (or
protons) antisymmetrical wave functions only must be used to interpret
the experimental data has been discussed at length in § 22 of
Part I.

† It should be remembered that the permanence of the antisymmetrical character
of the wave function has been established in Part I on the basis of the ordinary
Schrödinger equation for a system of identical particles without spin.

§ 40 IDENTICAL PARTICLES AND EXCLUSION PRINCIPLE 393

As the spin forces are very small compared with the electrostatic 
ones, a fairly good approximation (of ‘zero order’) can be obtained by 
totally neglecting them (as well as the magnetic forces of Biot and 
Savart, specified by the mutual kinetic energy 7”). 

The energy operator-matrix $H_{\zeta\zeta'}$ reduces in this case to the product
of the ordinary Hamiltonian operator for the system of particles under
consideration, $K$,
with the unit matrix (321 a). Limiting ourselves to solutions of the
type $\psi = \phi(x_1, \zeta_1, \dots, x_n, \zeta_n)\,e^{-i2\pi K't/h}$, which correspond to a motion with
a fixed energy $K'$, we thus get, instead of (330),


$$(K - K')\,\phi = 0. \tag{330 a}$$


This equation differs from that of the ordinary theory (not involving
the spin) only by the fact that $K$ is understood to contain as a factor
the unit matrix and that $\phi$ is to be regarded as a function both of the
ordinary coordinates and of the spin coordinates $\zeta_1, \dots, \zeta_n$. Since $K$ does
not contain the latter, or more exactly the spin matrices $\sigma_1, \sigma_2, \dots, \sigma_n$,
these matrices must commute with $K$ and represent consequently constants
of the motion. The characteristic values of their $z$-components
$\sigma_{kz} = 2m_k = \pm 1$ can be considered accordingly as additional spin
quantum numbers specifying $2^n$ solutions of (330 a), that is $2^n$ degenerate
states which belong to the same value of the energy $K'$. We
shall distinguish these $2^n$ states with the help of the indices $m_1, \dots, m_n$,
writing $m$ short for the whole set of them. It should be remembered
that the product of $m_k$ by $h/2\pi$ represents the projection of the spin
of the $k$th electron on the $z$-axis.

If we write $\zeta_k = -\tfrac12, +\tfrac12$ instead of 1 and 2 respectively (as was done
before), we can define a set of $2^n$ orthogonal and normal solutions of
the equation $(K - K')\phi = 0$ which belong to the same characteristic
value of $K$ by the formula

$$\phi_{K'm}(x, \zeta) = \phi_{K'}(x)\,\delta_{m\zeta}, \tag{331}$$

where $\delta_{m\zeta} = \delta_{m_1\zeta_1}\delta_{m_2\zeta_2}\cdots\delta_{m_n\zeta_n}$ is the $2^n$-dimensional unit matrix equivalent
to (321 a) and $\phi_{K'}(x)$ the normalized solution of the ordinary
Schrödinger equation $(K - K')\phi_{K'} = 0$ not involving any spin coordinates.
We have in fact, by the definition (331),

$$\sum_\zeta \int \phi^*_{K'm}(x, \zeta)\,\phi_{K'm'}(x, \zeta)\,dV = \sum_\zeta \delta_{m\zeta}\,\delta_{m'\zeta} \int |\phi_{K'}(x)|^2\,dV = \delta_{mm'}. \tag{331 a}$$


This form of the solution of the Schrödinger equation with the spin
coordinates taken into account cannot, however, be reconciled with the
antisymmetry condition for the functions $\psi$, except when all the $n$ spin
quantum numbers $m_1, \dots, m_n$ have the same value (either $+\tfrac12$ or $-\tfrac12$). In
this case $\delta_{m\zeta}$ is a symmetrical function of the spin coordinates, and in
order to satisfy the antisymmetry condition we must define $\phi$ as an
antisymmetrical function of all the $n$ coordinate triplets $x_1, \dots, x_n$.

If some of the numbers $m_k$ have the value $-\tfrac12$ and others the value
$+\tfrac12$, the function $\psi$ as defined by (331) will not be antisymmetrical,
whatever the type of the space factor $\phi$.

The spin factors $\delta_{m\zeta}$ can be used, however, in this case to obtain
somewhat more complicated spin functions $\epsilon(\zeta)$ which are either symmetrical
with regard to all the variables $\zeta_1, \dots, \zeta_n$, or with regard to some
of them, being in the latter case antisymmetrical with regard to definite
pairs of these variables.

A symmetrical spin function $\epsilon(\zeta)$ can be formed by permuting the
variables $\zeta_i$ and $\zeta_k$ in those factors $\delta_{m_i\zeta_i}$ and $\delta_{m_k\zeta_k}$ for which $m_i \neq m_k$
and adding the results. If instead of adding we subtract them from
each other, we shall get a function antisymmetrical with regard to the
pair of variables $(\zeta_i, \zeta_k)$. Putting for the sake of brevity

$$\begin{aligned}
u(\zeta_i, \zeta_k) &= \delta_{-\frac12\zeta_i}\,\delta_{+\frac12\zeta_k} - \delta_{+\frac12\zeta_i}\,\delta_{-\frac12\zeta_k} = -u(\zeta_k, \zeta_i),\\
v(\zeta_i, \zeta_k) &= \delta_{-\frac12\zeta_i}\,\delta_{+\frac12\zeta_k} + \delta_{+\frac12\zeta_i}\,\delta_{-\frac12\zeta_k} = +v(\zeta_k, \zeta_i),
\end{aligned} \tag{332}$$

we get for $\epsilon(\zeta)$ an expression of the form

$$\epsilon_{ij}(\zeta) = u(\zeta_1, \zeta_2)\,u(\zeta_3, \zeta_4)\cdots u(\zeta_{2i-1}, \zeta_{2i})\,v_j(\zeta_{2i+1}, \dots, \zeta_n), \tag{332 a}$$

where $v_j(\zeta_{2i+1}, \dots, \zeta_n)$ is a symmetrical function of the $n-2i$ variables
$\zeta_{2i+1}, \dots, \zeta_n$ formed by taking the product of a certain number $j$ of functions
of the type $v(\zeta_i, \zeta_k)$ and of $n-2(i+j)$ simple functions $\delta_{m'\zeta}$ with
the same value $m'$ of $m_k$, and summing such products for all non-trivial
permutations of the variables $\zeta_{2i+1}, \dots, \zeta_n$:

$$v_j(\zeta_{2i+1}, \dots, \zeta_n) = \sum v(\zeta_{2i+1}, \zeta_{2i+2})\cdots v(\zeta_{2i+2j-1}, \zeta_{2i+2j})\,\delta_{m'\zeta_{2i+2j+1}}\cdots\delta_{m'\zeta_n}. \tag{332 b}$$

The numbers $i$ and $j$ fully specify the spin functions $\epsilon_{ij}(\zeta)$ for a fixed
arrangement of the variables $\zeta_1, \dots, \zeta_n$. By permuting the latter we can
obtain other functions of the same symmetry type.
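
The symmetry properties stated in (332) can be checked by direct enumeration. The following is only a minimal sketch: the function names `delta`, `u` and `v` and the representation of the spin coordinate by the numbers -0.5 and +0.5 are choices made here for illustration, not notation taken from the text.

```python
from itertools import product

HALF = 0.5  # the two spin values are represented as -0.5 and +0.5 (an assumption of this sketch)


def delta(m, zeta):
    """Element of the unit matrix: 1 if the spin coordinate equals m, otherwise 0."""
    return 1 if m == zeta else 0


def u(zi, zk):
    """Pair function of (332) that changes sign when its two arguments are exchanged."""
    return delta(-HALF, zi) * delta(+HALF, zk) - delta(+HALF, zi) * delta(-HALF, zk)


def v(zi, zk):
    """Pair function of (332) that is unchanged when its two arguments are exchanged."""
    return delta(-HALF, zi) * delta(+HALF, zk) + delta(+HALF, zi) * delta(-HALF, zk)


# Exhaustive check over the four possible values of the pair of spin coordinates.
for zi, zk in product((-HALF, +HALF), repeat=2):
    assert u(zi, zk) == -u(zk, zi)
    assert v(zi, zk) == v(zk, zi)
```

Both functions vanish when the two spin coordinates are equal, so that only the two 'mixed' configurations contribute.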



Before, however, proceeding to such permutations, let us multiply
the function (332 a) by a space factor $\phi_i(x)$ which we shall assume to
be symmetrical with regard to the pairs of coordinate triplets $(x_1, x_2)$,
$(x_3, x_4), \dots, (x_{2i-1}, x_{2i})$ and antisymmetrical with regard to the rest. The
product
$$\phi_i(x)\,\epsilon_{ij}(\zeta) \tag{333}$$
will obviously be antisymmetrical with regard to the pairs of coordinate
quadruplets $(x_1\zeta_1, x_2\zeta_2)$, $(x_3\zeta_3, x_4\zeta_4), \dots, (x_{2i-1}\zeta_{2i-1}, x_{2i}\zeta_{2i})$ and antisymmetrical
with regard to all the other coordinate quadruplets. It
will have, however, no symmetry whatever with regard to permutations
affecting the variables of different groups, corresponding, for example,
to interchanges between the first and the third electron, or the first and
the $(2i+1)$th one. If we now apply such permutations ($P_i$) to the
function (333) and add the results, we can obtain a function

$$\psi_{ij}(x, \zeta) = \sum P_i[\phi_i(x)\,\epsilon_{ij}(\zeta)] \tag{333 a}$$


which will be antisymmetrical with regard to all the electrons, i.e. all 
the coordinate quadruplets. Permutations of this class can hardly be 
defined explicitly for the general case (arbitrary values of $i$ and $j$).
They can, however, be specified unambiguously by certain simple con- 
ditions which we shall not consider here. 

The antisymmetrical wave functions (333 a) can also be obtained by
starting from ‘spinless’ functions of the type $\phi_{ij}(x)$ symmetrical with regard
to $i$ pairs of electrons and antisymmetrical with regard to $j$ other
pairs, while antisymmetrical with respect to all the other $n-2(i+j)$
electrons. The complementary spin factors $\epsilon(\zeta)$ should reduce in this
case to a product of $i$ factors $u$, $j$ factors $v$, and $n-2(i+j) = 2|m|$ factors
$\delta_{m'\zeta}$. The permutations $P_{ij}$ which must be applied to the products
$\phi_{ij}(x)\,\epsilon_{ij}(\zeta)$ in order to obtain the functions

$$\psi_{ij}(x, \zeta) = \sum P_{ij}[\phi_{ij}(x)\,\epsilon_{ij}(\zeta)],$$

identical with those defined by (333 a), will constitute a broader class
than the permutations $P_i$. In fact, they can be defined as the products
of the latter and of the permutations which must be applied to the
spin functions
$$v(\zeta_{2i+1}, \zeta_{2i+2})\cdots v(\zeta_{2(i+j)-1}, \zeta_{2(i+j)})\,\delta_{m'\zeta_{2(i+j)+1}}\cdots\delta_{m'\zeta_n}$$
in order to obtain upon addition the symmetrical function (332 b).
In constructing the functions (333 a) we have left out of account the condition
that they must satisfy the ‘spinless’ Schrödinger equation. Now
it is easily seen that this condition is fulfilled so long as it is fulfilled
for the space factor $\phi_i(x)$ in the initially chosen function (333). Applying
to the equation $K\phi_i(x) = K_i\phi_i(x)$ any permutation $P_i$, we have
indeed, since $K$ is symmetrical with regard to all the electrons and $K_i$
is a pure number,

$$P_i[K\phi_i(x)] = K[P_i\phi_i(x)] = K_i[P_i\phi_i(x)].$$

This shows that if $\phi_i(x)$ is a characteristic function of the operator $K$
belonging to a certain characteristic value (energy-level) $K_i$, then all
the functions resulting from it by permuting the electrons will also be
characteristic functions, belonging to the same energy-level. This being
so, any linear combination of such functions will have the same property,
which therefore will be shared by the unique combination (333 a)
satisfying the antisymmetry condition (the factors $P_i[\epsilon_{ij}(\zeta)]$, which are
equal either to $\pm 1$ or to 0, playing the role of ordinary coefficients
with regard to the functions $P_i[\phi_i(x)]$).

It remains to be seen whether the equation $K\phi = K'\phi$ actually has
solutions of the type $\phi_i$, i.e. antisymmetrical with regard to all the
$n$ electrons ($i = 0$), or symmetrical with regard to one pair (1, 2), and
antisymmetrical with regard to the rest ($i = 1$), or symmetrical with
regard to two pairs [(1, 2), (3, 4)], and antisymmetrical with regard to
the rest ($i = 2$), and so on. A rigorous proof of this existence theorem
is not easy and we shall not stop to give it. The following remarks are 
worth mentioning, however, in this connexion: 

1. The functions $\phi_i$ defined above (or their linear combinations) are
not the only characteristic functions of a symmetrical operator $K$; the
latter has, besides, a number of characteristic functions with an entirely
different symmetry character, for instance, symmetrical with regard
to all the coordinate triplets or antisymmetrical with regard to two
or three of them, and symmetrical with regard to the rest, and so on.
Such solutions, although they exist mathematically, are non-existent
physically, i.e. they do not correspond to any real phenomenon, for
they cannot provide a basis for constructing functions antisymmetrical
with regard to all the coordinate quadruplets $x_k$, $\zeta_k$. The fact that such
a basis is provided only by functions of the type $\phi_i(x)$ is a consequence
of the two-valuedness of the spin quantum numbers $m_k$ of the individual
electrons, this two-valuedness determining the symmetry type of the
‘spin-factors’ $\epsilon(\zeta)$ and thence indirectly the symmetry type of the
associated space-factors $\phi(x)$.

2. The functions $\phi_i(x)$ (or their linear combinations) corresponding
to different values of $i$ belong in general to different characteristic
values $K_i$ of the energy operator. They can be introduced as ‘non-combining’
solutions of the equation $\left(K + \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}\right)\phi = 0$ in that case
also when $K$ contains the time explicitly (i.e. when the electrons are
supposed to move under the influence of a variable field of some external
origin). In this case the symmetry character of $\phi$ remains a permanent
property, if no difference is made between various linear combinations
of the functions $P_i[\phi_i(x)]$ with the same value of $i$, the permanence of
the antisymmetry character of $\phi_0$ being a particular case of this theorem
(the latter holds likewise for a number of solutions belonging to other
symmetry classes, not realized in nature).

It will be convenient in the sequel to replace the numbers $i$ and $j$,
which specify the functions (333) or (333 a), by two other numbers,

$$s = \tfrac12(n - 2i) = \tfrac12 n - i, \tag{334}$$
and
$$m = \sum_{k=1}^{n} m_k = \pm\left(\tfrac12 n - i - j\right). \tag{334 a}$$


The latter can obviously be interpreted as the component of the resulting
spin angular momentum of all the electrons along the $z$-axis (in
$h/2\pi$ units); in fact it is equal to the algebraic sum of the characteristic
values of the matrices $\tfrac12\sigma_{kz}$ for the individual electrons. For a given
value of $s$, $m$ can assume $2s+1$ values differing from each other by 1
and lying between $+s$ and $-s$. This circumstance suggests the interpretation
of $s$ as the magnitude of the vector specifying the resulting spin
of all the electrons (irrespective of its direction). The characteristic
value of the square of this total spin is equal to the product of $(h/2\pi)^2$
with $s(s+1)$, just as in the case of the resulting ‘orbital’ momentum,
defined by the number $j$ (see § 37).

The above interpretation of the number $s$ is also supported by the
fact that its maximum value is equal to $\tfrac12 n$, which corresponds to the
same direction of the spin vectors $\boldsymbol{\sigma}_k$ of the separate electrons. It
thus appears that the resulting spin associated with a given solution $\phi_i$
of the ‘spinless’ Schrödinger equation is equal to one-half of the number
of electrons with regard to which this function is antisymmetrical.
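
The bookkeeping of (334) and (334 a) may be illustrated by a few lines of code. This is a sketch only; the function names `total_spin`, `m_values` and `abs_m` are introduced here for illustration, and exact fractions are used so that half-integral values are represented without rounding.

```python
from fractions import Fraction


def total_spin(n, i):
    """Resulting spin s = n/2 - i for n electrons and i symmetrical pairs, as in (334)."""
    return Fraction(n, 2) - i


def m_values(s):
    """The 2s+1 projections m = -s, -s+1, ..., +s belonging to a given s."""
    count = int(2 * s) + 1
    return [-s + k for k in range(count)]


def abs_m(n, i, j):
    """Magnitude |m| = n/2 - i - j of the axial spin number, as in (334 a)."""
    return Fraction(n, 2) - i - j


# Three electrons, no symmetrical pair: s = 3/2 with four values of m.
print(total_spin(3, 0), m_values(total_spin(3, 0)))
# Three electrons, one symmetrical pair: s = 1/2 with two values of m.
print(total_spin(3, 1), m_values(total_spin(3, 1)))
```

For $n = 3$ the two cases correspond to the quadruplet and the doublet of the three-electron problem.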

We shall now consider, for the sake of illustration, the special cases
of systems consisting of two and three electrons, a helium and a lithium
atom, say. In the first case we get functions $\phi_i(x)$ of two types only,
namely, the antisymmetrical one $\phi_0(x) = \phi_0(\underline{x_1, x_2})$ and the symmetrical
$\phi_1(x) = \phi_1(\overline{x_1, x_2})$ (following Heitler and London, we introduce lines
under or over the neighbouring variables, to indicate the antisymmetrical
or symmetrical character of the wave function with regard
to these variables). Taking further the four combinations of the individual
spin quantum numbers $m_1$ and $m_2$, namely, $(-\tfrac12, -\tfrac12)$, $(-\tfrac12, +\tfrac12)$,
$(+\tfrac12, -\tfrac12)$, $(+\tfrac12, +\tfrac12)$, we can form three symmetrical spin functions,

$$\delta_{-\frac12\zeta_1}\delta_{-\frac12\zeta_2}, \qquad v(\zeta_1, \zeta_2) = \delta_{-\frac12\zeta_1}\delta_{+\frac12\zeta_2} + \delta_{+\frac12\zeta_1}\delta_{-\frac12\zeta_2}, \qquad \delta_{+\frac12\zeta_1}\delta_{+\frac12\zeta_2},$$

and one antisymmetrical

$$u(\zeta_1, \zeta_2) = \delta_{-\frac12\zeta_1}\delta_{+\frac12\zeta_2} - \delta_{+\frac12\zeta_1}\delta_{-\frac12\zeta_2}.$$

The products of the former with the antisymmetrical space function
$\phi_0(x_1, x_2)$ define three states, corresponding to the same resulting spin
$s = 1$ (parallel orientation of the two electrons) and to the values
$m = -1, 0, +1$ of its projection on the $z$-axis, whereas the product of
$u(\zeta_1, \zeta_2)$ with $\phi_1(x_1, x_2)$ defines a single state corresponding to $s = 0$ and
$m = 0$ (‘anti-parallel’ orientations of the spins).
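
The two-electron case lends itself to a direct numerical check that all four products change sign when the two coordinate quadruplets are interchanged. The sketch below rests on stated assumptions: the space functions `phi0` and `phi1` are arbitrary stand-ins with the required symmetry (not solutions of any Schrödinger equation), and the spin coordinate is represented by the numbers -0.5 and +0.5.

```python
from itertools import product

UP, DOWN = 0.5, -0.5  # representation of the two spin values (an assumption of this sketch)


def delta(m, z):
    return 1 if m == z else 0


def u(z1, z2):  # antisymmetrical pair function
    return delta(DOWN, z1) * delta(UP, z2) - delta(UP, z1) * delta(DOWN, z2)


def v(z1, z2):  # symmetrical pair function
    return delta(DOWN, z1) * delta(UP, z2) + delta(UP, z1) * delta(DOWN, z2)


def phi0(x1, x2):
    """Hypothetical antisymmetrical space function."""
    return x1 - x2


def phi1(x1, x2):
    """Hypothetical symmetrical space function."""
    return x1 + x2


# The three symmetrical spin functions, labelled by the projection m.
SPIN_TRIPLET = {
    -1: lambda z1, z2: delta(DOWN, z1) * delta(DOWN, z2),
    0: v,
    +1: lambda z1, z2: delta(UP, z1) * delta(UP, z2),
}


def psi_triplet(m, x1, z1, x2, z2):
    return phi0(x1, x2) * SPIN_TRIPLET[m](z1, z2)


def psi_singlet(x1, z1, x2, z2):
    return phi1(x1, x2) * u(z1, z2)


def is_antisymmetric(psi, x1=1, x2=3):
    """Check psi(1, 2) = -psi(2, 1) for every pair of spin coordinates at one sample point."""
    return all(psi(x1, z1, x2, z2) == -psi(x2, z2, x1, z1)
               for z1, z2 in product((DOWN, UP), repeat=2))
```

A usage example: `is_antisymmetric(psi_singlet)` and `is_antisymmetric(lambda *a: psi_triplet(0, *a))` both return `True` for these stand-in functions.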

In the case of three electrons we must distinguish likewise two types
of ‘spinless’ functions, namely, those antisymmetrical with regard to
all three electrons, $\phi_0(x) = \phi_0(\underline{x_1, x_2, x_3})$, and those symmetrical with
regard to two of them, $\phi_1(x) = \phi_1(\overline{x_1, x_2}, x_3)$, say (the third electron, being
alone, does not require any specific condition with respect to symmetry).

The functions of the first type must be combined with a symmetrical
spin factor $\epsilon(\zeta_1, \zeta_2, \zeta_3)$ which can be obtained either in the form
$$\epsilon = \delta_{m'\zeta_1}\,\delta_{m'\zeta_2}\,\delta_{m'\zeta_3}$$
if $m_1 = m_2 = m_3 = m' = \pm\tfrac12$ ($\sum m_k = \pm\tfrac32$), or in the form

$$\epsilon = v(\zeta_1, \zeta_2)\,\delta_{m'\zeta_3} + v(\zeta_3, \zeta_1)\,\delta_{m'\zeta_2} + v(\zeta_2, \zeta_3)\,\delta_{m'\zeta_1},$$
if one of the numbers $m_k$ is different from the two others ($\sum m_k = \pm\tfrac12$).
We thus get a ‘quadruplet’, i.e. four states with the same $s = \tfrac32$ and consequently
with the same value of the energy $K = K_0$, which are distinguished
from each other by the values of the resulting ‘axial’ spin
numbers $m = -\tfrac32, -\tfrac12, \tfrac12, \tfrac32$.

The functions of the second type, $\phi_1(\overline{x_1, x_2}, x_3)$, must be combined
with spin factors of the form

$$\epsilon(\zeta_1, \zeta_2, \zeta_3) = u(\zeta_1, \zeta_2)\,\delta_{m'\zeta_3},$$
and summed over the cyclic permutations of all the three electrons,
giving two antisymmetrical functions,

$$\psi(x, \zeta) = \phi_1(\overline{x_1, x_2}, x_3)\,u(\zeta_1, \zeta_2)\,\delta_{m'\zeta_3} + \phi_1(\overline{x_3, x_1}, x_2)\,u(\zeta_3, \zeta_1)\,\delta_{m'\zeta_2} + \phi_1(\overline{x_2, x_3}, x_1)\,u(\zeta_2, \zeta_3)\,\delta_{m'\zeta_1},$$
for two different values of $m'$; the states defined by them belong to the
same value $\tfrac12$ of $s$ and to the same energy $K = K_1$, forming what is
called a ‘spin doublet’ of a similar type to that for a single electron.
The antisymmetrical character of the functions $\psi(x, \zeta)$ is clearly seen
from the fact that if two electrons, the first and the second, say, are
interchanged, the first term changes its sign, whereas the second and
third are transformed into each other with opposite signs. It should
be mentioned that the normal state of a lithium atom, constituted by
two equivalent inner electrons, forming its ‘core’, and one ‘valence’
electron, must be described by a wave function of the above type.
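
The antisymmetry of the three-electron doublet function can also be verified by brute force. In the sketch below the space factor `phi1` is a made-up function, symmetrical in its first two arguments only, standing in for a true solution; the cyclic sum is written out as the three terms with the argument orders (1, 2, 3), (3, 1, 2) and (2, 3, 1), and integer sample coordinates keep the arithmetic exact.

```python
from itertools import product

UP, DOWN = 0.5, -0.5  # representation of the two spin values (an assumption of this sketch)


def delta(m, z):
    return 1 if m == z else 0


def u(z1, z2):
    """Antisymmetrical pair spin function."""
    return delta(DOWN, z1) * delta(UP, z2) - delta(UP, z1) * delta(DOWN, z2)


def phi1(a, b, c):
    """Hypothetical space factor, symmetrical in its first two arguments only."""
    return (a + b) * c * c


def psi(x, z, m_prime):
    """Sum over the cyclic permutations of phi1 * u * delta for three electrons."""
    x1, x2, x3 = x
    z1, z2, z3 = z
    return (phi1(x1, x2, x3) * u(z1, z2) * delta(m_prime, z3)
            + phi1(x3, x1, x2) * u(z3, z1) * delta(m_prime, z2)
            + phi1(x2, x3, x1) * u(z2, z3) * delta(m_prime, z1))


def swap(t, a, b):
    """Return a copy of the tuple t with entries a and b interchanged."""
    t = list(t)
    t[a], t[b] = t[b], t[a]
    return tuple(t)


def is_antisymmetric(m_prime, x=(1, 2, 5)):
    """Interchange every pair of coordinate quadruplets and check the change of sign."""
    for z in product((DOWN, UP), repeat=3):
        for a, b in ((0, 1), (0, 2), (1, 2)):
            if psi(swap(x, a, b), swap(z, a, b), m_prime) != -psi(x, z, m_prime):
                return False
    return True
```

The check succeeds for both values of `m_prime`, in agreement with the two members of the doublet, whatever function symmetrical in its first two arguments is substituted for `phi1`.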


VIII


REDUCTION OF THE PROBLEM OF A SYSTEM OF 
IDENTICAL PARTICLES TO THAT OF A SINGLE PARTICLE 


41. Perturbation Theory of a System of Spinless Electrons and 
the Exchange Degeneracy 

Further progress in the study of the problem of many electrons can be 
achieved only if we describe their motion in a way similar to that used 
in Bohr’s theory of complex atoms, namely, by assigning to each 
electron an individual state of motion in a given field of force. The 
mutual action of the electrons can be partially accounted for by intro- 
ducing some constants like the screening constants, in the definition of 
the appropriate field of force for each electron, or by using the same
suitably chosen field of force for all of them—a self-consistent field, for 
example (see below). The problem of the motion of the whole system is 
thus reduced to that of the motion of the separate particles constituting 
it and to the determination of the effective external field which can 
approximately represent their mutual action. Inasmuch as this mutual 
action is accounted for inexactly, we can obtain a better approximation 
by treating it, or that part of it which was not included to begin with 
in the effective field of force, as a small perturbation, and approach 
the exact solution by the methods of the perturbation theory, starting
with the solution which corresponds to a distribution of the electrons 
between various individual states of motion (or ‘orbits’). 

A characteristic distinction between Bohr’s theory and the new 
quantum theory in connexion with this perturbation problem consists 
in the fact that the electrons must be interchanged between all the individual 
orbits in such a way as to be completely stripped of their individuality. 
This result, which is expressed by the symmetry principle for the
probability density $\psi\psi^*$ or the antisymmetry principle for the probability
amplitude $\psi$, can be shown to be in harmony with the principles of
the perturbation theory applied to the problem of a system of identical 
particles. 

The wave function $\phi$ describing their motion can be represented to
begin with as the product of the functions $\psi_1(x_1), \psi_2(x_2), \dots, \psi_n(x_n)$ describing
the behaviour of the individual electrons in the given external field
of force. Putting
$$\phi(x) = \psi_1(x_1)\,\psi_2(x_2)\cdots\psi_n(x_n) \tag{335}$$
and denoting by $P\phi$ the function into which $\phi$ is transformed when


§ 41 SYSTEM OF SPINLESS ELECTRONS 401 
the permutation $P$ is applied to the electrons, we can represent the
general solution of our undisturbed problem, belonging to the same
energy as $\phi(x)$, by the expression

$$\chi(x) = \sum_P C_P\,P\phi, \tag{335 a}$$

where $C_P$ are arbitrary coefficients, the sum being extended over all
the possible permutations, or at least over the ‘effectively different’
ones, i.e. such as lead to different functions $P\phi$.

If all the $n$ individual wave functions $\psi_1, \psi_2, \dots, \psi_n$ are different, every
one of the $n!$ possible permutations $P$ will be associated with a specific
function $P\phi$. In the contrary case the permutations $P$ can be subdivided
into separate sets of equivalent permutations, which correspond
to identical functions $P\phi$, and in writing down (335 a) we shall have
to consider only one representative of each set.
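
The number of 'effectively different' functions can be counted by enumeration. The sketch below is an illustration under stated assumptions: individual states are represented by arbitrary labels, and the closed formula $n!$ divided by the factorials of the multiplicities is the standard combinatorial count of distinct arrangements, quoted here for comparison rather than taken from the text.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def distinct_assignments(states):
    """All effectively different ways of distributing the electrons over the given states."""
    return set(permutations(states))


def count_by_formula(states):
    """n! divided by the factorial of the multiplicity of each repeated state."""
    g = factorial(len(states))
    for multiplicity in Counter(states).values():
        g //= factorial(multiplicity)
    return g


# Three different individual states: all 3! = 6 permutations give different functions.
print(len(distinct_assignments(("a", "b", "c"))))
# Two of the three states coincide: the permutations fall into 3 sets of 2 equivalent ones.
print(len(distinct_assignments(("a", "a", "b"))), count_by_formula(("a", "a", "b")))
```

The labels "a", "b", "c" are hypothetical and stand for any set of individual wave functions.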

We shall assume for the sake of simplicity that apart from this 
‘exchange degeneracy’, arising from the possibility of interchanging 
the electrons between different individual states without altering the 
total energy, no other type of degeneracy need be considered. 

We shall disregard in this section the spin effects and treat the
electrons as spinless particles, using for the determination of their
motion the ordinary Schrödinger theory. We shall leave aside furthermore
the question as to the symmetry of the functions $\chi(x)$ and shall
try to determine the coefficients $C_P$ by which they are defined in such
a way as to ensure the approximate validity of the expression (335 a)
when the perturbing forces (i.e. the mutual action of the electrons or
the neglected part of this mutual action) are taken into account. In
this case the function (335 a) is said to be ‘stabilized’ for the perturbation.
It is meant by this that if the approximation is pushed further,
the coefficients $C_P$ will suffer but a slight variation. This question has
been considered in its most general form in the perturbation theory of
degenerate systems. As has been shown there, the degenerate set of
states specified by the functions $P\phi$ gives rise to the same number
of states belonging in general to different energy-levels $H'$ and specified
by the values of the coefficients $C_P$ which satisfy the system of
equations

$$\sum_Q H_{PQ}\,C_Q = H'\,C_P, \tag{336}$$

where $H_{PQ}$ are the matrix elements of the total energy with regard to
the approximate functions $P\phi$ and $Q\phi$:

$$H_{PQ} = \int P\phi^*\,H\,Q\phi\,dV, \tag{336 a}$$

$Q$ denoting, as well as $P$, a permutation of the electrons.


402 SYSTEM OF IDENTICAL PARTICLES § 41 

In writing down the equations (336) we are tacitly assuming that the 
different functions P¢ are mutually orthogonal. This assumption is 
easily seen to be verified if the functions $\psi_k(x_k)$ describing the different
individual states are orthogonal with regard to each other. Now the 
mutual orthogonality of the individual functions is automatically 
secured if they represent different stationary states of an electron in 
a given external field—the same for all the n electrons. In many actual 
problems it is more convenient, however, to assign to each electron 
a specific field of force (for instance, a Coulomb field, characterized 
by a specific value of the screening constant in the problem of the 
distribution of electrons in a heavy atom), in which case the individual 
wave functions can no longer be considered as mutually orthogonal. 

The equations (336) must be replaced in this case according to (61),
§ 9, by the following ones:

$$\sum_Q (H_{PQ} - H'J_{PQ})\,C_Q = 0, \tag{337}$$
where
$$J_{PQ} = \int P\phi^*\,Q\phi\,dV. \tag{337 a}$$


The value of this integral must obviously remain unaltered if the 
integration variables are replaced by any other ones (which amounts 
simply to a change of notation). We can, in particular, interchange 
them in a manner corresponding to an arbitrary permutation $R$ of the
electrons. The functions $P\phi^*$ and $Q\phi$ will be replaced accordingly by
$RP\phi^*$ and $RQ\phi$, so that we shall get

$$J_{PQ} = \int RP\phi^*\,RQ\phi\,dV = J_{RP,RQ}.$$
It should be noticed that the permutation $R$ must not be applied to
the functions $\phi^*$ and $\phi$, the result
$$\int PR\phi^*\,QR\phi\,dV = J_{PR,QR}$$
being in general quite different from the preceding one.

If, in particular, $R$ is identified with the reciprocal of $Q$ ($R = Q^{-1}$),
we get
$$J_{PQ} = J_{Q^{-1}P}, \tag{338}$$
where $J_R$ is an abbreviation for $J_{R,I}$, $I$ denoting the identical permutation
($I\phi = \phi$).
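
The reduction of the overlap integrals to a single index can be checked on a toy model. The sketch below makes several assumptions that are not in the text: the integral is replaced by a sum over a three-point grid, the three individual functions are arbitrary real, non-orthogonal integer vectors, and a permutation $P$ is taken to act by placing the coordinate numbered `P[k]` in the `k`-th argument of the product function. With this convention applying $B$ first and then $A$ corresponds to the composite `compose(A, B)`.

```python
from itertools import permutations, product

# Hypothetical real, non-orthogonal individual functions on a 3-point grid.
ORBITALS = [(1, 2, 0), (0, 1, 3), (2, 1, 1)]
N = len(ORBITALS)
GRID = range(3)
PERMS = list(permutations(range(N)))
IDENTITY = tuple(range(N))


def phi(x):
    """Product function: psi_1(x_1) psi_2(x_2) ... psi_n(x_n)."""
    out = 1
    for k in range(N):
        out *= ORBITALS[k][x[k]]
    return out


def permuted_arguments(P, x):
    """Arguments at which phi is evaluated to obtain (P phi)(x)."""
    return tuple(x[P[k]] for k in range(N))


def compose(A, B):
    """Permutation equivalent to applying B first and then A to a function."""
    return tuple(A[B[k]] for k in range(N))


def inverse(P):
    inv = [0] * N
    for k, pk in enumerate(P):
        inv[pk] = k
    return tuple(inv)


def J(P, Q):
    """Discrete analogue of the overlap integral of P phi and Q phi (real functions)."""
    return sum(phi(permuted_arguments(P, x)) * phi(permuted_arguments(Q, x))
               for x in product(GRID, repeat=N))


# Every two-index overlap equals a one-index overlap taken against the identity.
for P in PERMS:
    for Q in PERMS:
        assert J(P, Q) == J(compose(inverse(Q), P), IDENTITY)
```

The 36 quantities `J(P, Q)` thus take at most 6 different values, one for each permutation $R = Q^{-1}P$.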

We get likewise, because of the symmetry of the energy operator $H$
with regard to all the electrons,

$$H_{PQ} = H_{RP,RQ}$$
and in particular
$$H_{PQ} = H_{Q^{-1}P}, \tag{338 a}$$
$H_R$ being an abbreviation for $H_{R,I}$.

The relations $J_{QP} = J^*_{PQ}$ and $H_{QP} = H^*_{PQ}$ can be written accordingly
in the following form:

$$J_{R^{-1}} = J^*_R, \qquad H_{R^{-1}} = H^*_R, \tag{338 b}$$
where $R = Q^{-1}P$ and $R^{-1} = P^{-1}Q$. We thus see that the number of
different matrix elements $H_{PQ}$ and $J_{PQ}$ is actually reduced to the
number, $g$ say, of different states $P\phi$ instead of being equal to its
square $g^2$.

The equations (337) can be rewritten as follows:
$$\sum_R (H_R - H'J_R)\,C_{PR^{-1}} = 0, \tag{339}$$

the summation over all the permutations $R$ being obviously equivalent
to the original summation over the permutations $Q$, with a fixed permutation
$P$, the latter specifying each of the $g$ equations forming our
system. The perturbed values of the energy $H'$ are determined as the
roots of the determinantal equation

$$|H_{Q^{-1}P} - H'J_{Q^{-1}P}| = 0, \tag{339 b}$$
which expresses the condition of their compatibility.

Two types of solution of our perturbation problem are immediately
obtained from the equations (339), namely, those which correspond
to the symmetrical and to the antisymmetrical functions $\chi$. In the
former case all the coefficients $C_Q$ are equal, so that they cancel out
and the equations (339) reduce to the single equation

$$\sum_R (H_R - H'J_R) = 0,$$
which serves for the determination of the energy

$$H'_{\text{symm}} = \sum_R H_R \Big/ \sum_R J_R. \tag{340}$$

In the latter case the coefficients $C_Q$ are defined by the formula
$C_Q = \epsilon_Q C$, where $\epsilon_Q = +1$ for even permutations (equivalent to an
even number of transpositions) and $= -1$ for odd ones. Since in this
case $C_{PR^{-1}} = \epsilon_P\epsilon_R C$, the $g$ equations (339) again reduce to the single
equation
$$\sum_R \epsilon_R (H_R - H'J_R) = 0,$$
whence
$$H'_{\text{antisymm}} = \sum_R \epsilon_R H_R \Big/ \sum_R \epsilon_R J_R. \tag{340 a}$$
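
The formulae (340) and (340 a) are easily evaluated once the quantities $H_R$ and $J_R$ are known. The sketch below treats the simplest case of two electrons, where the only permutations are the identity and one transposition; the numerical values of $H_R$ and $J_R$ are invented purely for illustration, and the helper names are not taken from the text. For two electrons the determinant (339 b) is of the second order, so that both energies can be checked against it directly.

```python
from fractions import Fraction


def parity(perm):
    """+1 for an even permutation, -1 for an odd one, found by counting inversions."""
    n = len(perm)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1


def symmetric_energy(H, J):
    """Energy of the symmetrical solution: sum of H_R divided by sum of J_R."""
    return sum(H.values()) / sum(J.values())


def antisymmetric_energy(H, J):
    """Energy of the antisymmetrical solution, each term weighted with its parity."""
    return (sum(parity(R) * H[R] for R in H)
            / sum(parity(R) * J[R] for R in J))


I, T = (0, 1), (1, 0)  # identity and transposition of two electrons

# Made-up values of H_R and J_R (exact fractions keep the check free of rounding).
H = {I: Fraction(-10), T: Fraction(-3)}
J = {I: Fraction(1), T: Fraction(1, 5)}


def secular_determinant(E):
    """Second-order determinant of H_{Q^-1 P} - E J_{Q^-1 P} for two electrons."""
    a = H[I] - E * J[I]  # diagonal elements (R = identity)
    b = H[T] - E * J[T]  # off-diagonal elements (R = transposition)
    return a * a - b * b


print(symmetric_energy(H, J), antisymmetric_energy(H, J))
```

Both energies make `secular_determinant` vanish, as they must, since for two electrons the symmetrical and antisymmetrical solutions exhaust the roots of (339 b).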


One might be tempted to look for more general solutions of (339) by 
assuming that $C_{PQ} = \text{const.}\,C_P C_Q$, or $C_P = \text{const.}\,e^{i\alpha_P}$. It can easily
be shown, however, in the same way as in Part I, § 22, that this assump- 
tion leads to symmetrical and antisymmetrical functions only. The 
symmetry properties of all the other solutions can be determined by 
the following method due to Dirac. 



According to Dirac, permutations can be dealt with in exactly the 
same way as ordinary linear operators which serve to represent various 
physical quantities. They can, in fact, be multiplied by each other, the 
product being in general non-commutative, i.e. depending upon the 
order of the factors, but satisfying the associative law (just as in 
the case of differential or matrix operators investigated hitherto). 
It is possible further to define the sum of two or more permutations as
an operator, which without being itself a permutation is equivalent to 
them in the sense of the distributive law: 


$$(P_1 + P_2)F = P_1F + P_2F,$$


where F denotes any other operator or function. 
To each permutation
$$P = \begin{pmatrix} 1 & 2 & \dots & n \\ \alpha_1 & \alpha_2 & \dots & \alpha_n \end{pmatrix}$$
there corresponds the reciprocal permutation
$$P^{-1} = \begin{pmatrix} \alpha_1 & \alpha_2 & \dots & \alpha_n \\ 1 & 2 & \dots & n \end{pmatrix},$$
whose product with $P$, irrespective of the order of the two factors, is
equal to 1, i.e. is equivalent to the ‘identical’ permutation

$$I = \begin{pmatrix} 1 & 2 & \dots & n \\ 1 & 2 & \dots & n \end{pmatrix}.$$

Every permutation $P$ can be represented as a product of ‘cyclic’ permutations,
of the type
$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix} = (1, 2, 3, 4),$$
where each element in the brackets ( ) is replaced by the next, the last
one being replaced by the first. The different cycles into which $P$ is
thus factorized must have no common elements; they can be therefore
commuted with each other without changing the result. We have
for example,
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 7 & 5 & 4 & 2 & 3 & 9 & 1 & 8 & 6 \end{pmatrix} = (1, 7)(2, 5, 3, 4)(6, 9)(8),$$
the two-element cycles (1, 7) and (6, 9) being simply transpositions (i.e.
interchanges of two elements), while the one-element cycle (8) denotes
that the corresponding element is not affected by the permutation
considered.

Permutations which can be factorized into the same number of cycles
with the same number of elements (which may be different for different
permutations) are called ‘similar’ and form a ‘class’ specified by the
‘partition’ of the number $n$ into summands giving the number of elements
in each cycle. The partition for the above permutation is

$$n = 1 + 2 + 2 + 4.$$
Similar permutations P and Q can thus be obtained from each other 
by permuting the elements appearing in the cycles of one of them. 
Denoting by R the permutation which must be carried out in the 
cycles of P in order to obtain Q, we get 

$$Q = RPR^{-1}.$$

The factor $R^{-1}$ accounts for the fact that the permutation $R$ should
not affect the operator or function to which $P$ or $Q$ is supposed to
be applied ($RPF$ would be equivalent to applying the permutation $R$
both to $P$ and to $F$).
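
The factorization into cycles and the relation $Q = RPR^{-1}$ can be checked on the nine-element example by a short computation. The following is only a sketch: the function names and the particular permutation `R` are illustrative choices, not part of the text, and a permutation is stored as a dictionary sending each element to its image.

```python
def cycles(perm):
    """Disjoint cycles of a permutation given as a dict {element: image}."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        result.append(tuple(cycle))
    return result


def cycle_type(perm):
    """Partition of n given by the sorted cycle lengths."""
    return sorted(len(c) for c in cycles(perm))


def compose(a, b):
    """Product ab: b is applied first, then a."""
    return {x: a[b[x]] for x in b}


def inverse(a):
    return {image: x for x, image in a.items()}


# Nine-element example: bottom row 7 5 4 2 3 9 1 8 6.
P = dict(zip(range(1, 10), [7, 5, 4, 2, 3, 9, 1, 8, 6]))
print(cycles(P))      # [(1, 7), (2, 5, 3, 4), (6, 9), (8,)]
print(cycle_type(P))  # [1, 2, 2, 4]

# An arbitrary R (hypothetical choice) and the similar permutation Q = R P R^-1.
R = dict(zip(range(1, 10), [3, 1, 2, 9, 8, 7, 6, 5, 4]))
Q = compose(compose(R, P), inverse(R))
print(cycle_type(Q) == cycle_type(P))  # True
```

The cycles of `Q` are those of `P` with every element `x` replaced by `R[x]`, which is the statement that $R$ is "carried out in the cycles of $P$".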

Since every permutation P commutes with the energy operator H (H 
being symmetrical with respect to all the electrons), it can be treated 
as a constant of the motion. The fact that the different permutations 
do not in general commute with each other shows that it is impossible 
to assign simultaneously definite values to all these constants. It is
possible, however, to combine them linearly into a set of commutable
operators, which can be constructed by adding together all the permutations belonging to the same class. With a fixed $P$ and a variable $R$
each permutation $Q = RPR^{-1}$ will be obtained several times, namely
$n!/n_k$ times, where $n_k$ is the number of different permutations in the class
under consideration. The sum of all such permutations, or their
‘average’

$$\bar P = \frac{1}{n!}\sum_R RPR^{-1},$$


will obviously commute with all the permutations. We have in fact
$$T\bar P T^{-1} = \frac{1}{n!}\sum_R TRPR^{-1}T^{-1},$$
or, putting $TR = S$ and $R^{-1}T^{-1} = S^{-1}$,
$$T\bar P T^{-1} = \frac{1}{n!}\sum_S SPS^{-1} = \bar P$$
(since for a fixed $T$ and a variable $R$ the product $TR$ varies over the
same range as $R$). Hence $T\bar P = \bar P T$. It follows in particular that the
operators $\bar P_k$ referring to different classes ($k = 1, 2,\dots$) commute with
each other. Since, moreover, they commute with the energy operator
$H$, they can be considered as defining a set of independent constants
of the motion whose characteristic values $\bar P'_k$ can be determined simultaneously and can serve, together with the characteristic values of the
energy $H'$, to specify the stationary states of the system.
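
The commutation property of the class averages can be verified by direct enumeration for a small number of elements. The sketch below takes $n = 4$ (an arbitrary choice) and represents a linear combination of permutations as a dictionary from permutation to coefficient; all names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

n = 4
group = list(permutations(range(n)))  # p sends i to p[i]


def mul(a, b):
    """Product ab of two permutations: b is applied first, then a."""
    return tuple(a[b[i]] for i in range(n))


def inv(a):
    out = [0] * n
    for i, image in enumerate(a):
        out[image] = i
    return tuple(out)


def average(p):
    """Class average (1/n!) * sum over R of R p R^-1, as {permutation: coefficient}."""
    avg = {}
    for r in group:
        q = mul(mul(r, p), inv(r))
        avg[q] = avg.get(q, 0) + Fraction(1, len(group))
    return avg


def algebra_mul(x, y):
    """Product of two linear combinations of permutations."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            ab = mul(a, b)
            out[ab] = out.get(ab, 0) + ca * cb
    return out


averages = [average(p) for p in group]
commutes_with_all = all(
    algebra_mul({t: 1}, avg) == algebra_mul(avg, {t: 1})
    for avg in averages
    for t in group
)
distinct = []
for avg in averages:
    if avg not in distinct:
        distinct.append(avg)
print(commutes_with_all, len(distinct))  # True 5
```

Each permutation of a class enters its average with the same coefficient, the reciprocal of the number of permutations in the class, in agreement with the count $n!/n_k$ given above; the five distinct averages correspond to the five partitions of 4.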

The characteristic values of the operators $\bar P_k$ are obviously wholly
independent of the form of the energy operator (so long as it is symmetrical between all the electrons). They must be connected therefore
with the symmetry properties of the wave functions $\chi$, which belong
to them and can serve for the classification of the latter.

It should be noticed that the operators $\bar P_k$ preserve their role of constants of the motion in the general case of an energy operator containing
the time explicitly. This means that if the wave function $\chi$ satisfying
Schrödinger’s equation
$$-\frac{h}{2\pi i}\frac{\partial\chi}{\partial t} = H\chi$$
has at the initial moment $t = 0$ a definite symmetry type, specified by certain characteristic
values of the operators $\bar P_k$, it will maintain the same symmetry type at
any other time. The same result can be expressed by saying that the
stationary states of an unperturbed system belonging to different characteristic values of the permutation operators $\bar P_k$ do not combine with
each other under any perturbation (symmetrical in all the electrons).

The simplest examples of this theorem are provided by the symmetrical and the antisymmetrical wave functions. The characteristic
values of the $\bar P_k$ are equal to $+1$ for the former and to $\pm 1$ for the latter
($+1$ for even permutations and $-1$ for odd ones).

So long as the spin effects are left out of account we have to consider 
symmetrical and antisymmetrical functions only; if, however, the spin 
effect is allowed for, spinless functions of a more complicated character 
have to be admitted; to each set of characteristic P-values there 
corresponds in general not one but many wave functions of the same 
symmetry type (cf. Part I, § 22). If, moreover, the spin forces are taken 
into account (as a small perturbation), the states corresponding to 
different P-values will combine with each other. We thus get rather 
complicated results which can, however, be reduced to the original 
simple form if the spin coordinates are introduced in the definition of 
the wave functions on the same footing as the geometrical ones. 

If the electrons are associated with different individual states specified
by mutually orthogonal wave functions, the set of functions $P\phi$ can be
replaced by the set $P_\psi\phi$ obtained from $\phi = \psi_1(x_1)\psi_2(x_2)\dots\psi_n(x_n)$ by
applying the different permutations $P$ not to the arguments of the
functions $\psi$ but to their indices, that is, by permuting not the electrons
between the given states, but on the contrary the different states between the
electrons. Since by applying the same permutation $P$ both to the arguments and to the indices, we obviously do not change the resulting
factorized function, we can put
$$P_x P_\psi = P_\psi P_x = 1,$$
where the suffix $x$ has been added to indicate explicitly that $P$ is applied
to the electrons. We thus see that $P_\psi$ plays the same role as the
reciprocal of $P_x$ and vice versa.
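
The relation between permutations of the arguments and of the indices can be illustrated symbolically. In the sketch below a factorized function is stored merely as a set of (state, electron) pairs, one pair for each factor; this representation, the function names and the convention that a permutation `p` replaces the label `j` by `p[j]` are assumptions made for the illustration.

```python
from itertools import permutations

n = 4
# phi = psi_1(x_1) psi_2(x_2) ... psi_n(x_n), stored as a set of
# (state index, electron index) pairs; the order of the factors is immaterial.
phi = frozenset((i, i) for i in range(n))


def permute_arguments(f, p):
    """P_x: electron j is replaced by electron p[j] in every factor."""
    return frozenset((state, p[electron]) for state, electron in f)


def permute_indices(f, p):
    """P_psi: state i is replaced by state p[i] in every factor."""
    return frozenset((p[state], electron) for state, electron in f)


def inverse(p):
    out = [0] * len(p)
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)


# The same permutation applied to arguments and indices leaves phi unchanged.
same_on_both = all(
    permute_indices(permute_arguments(phi, p), p) == phi
    for p in permutations(range(n))
)
# A permutation of the arguments equals the reciprocal permutation of the indices.
reciprocal = all(
    permute_arguments(phi, p) == permute_indices(phi, inverse(p))
    for p in permutations(range(n))
)
print(same_on_both, reciprocal)  # True True
```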

Taking the matrix elements of the energy with respect to the new
functions $P_\psi\phi$, and remembering that they are invariant with regard
to any permutations $R$ of the electrons (i.e. of the integration variables),
we have

$$H^\psi_{PQ} = \int P_\psi\phi^*\,H\,Q_\psi\phi\,dV = R_x\int P_\psi\phi^*\,H\,Q_\psi\phi\,dV = \int P_\psi R_x\phi^*\,H\,Q_\psi R_x\phi\,dV$$

(since we must first permute the integration variables in $\phi^*$ and $\phi$ and
only thereafter carry out the permutations $P_\psi$, $Q_\psi$ of the indices). The
functions $R_x\phi^*$ and $R_x\phi$ can further be replaced by $R_\psi^{-1}\phi^*$ and $R_\psi^{-1}\phi$,
the permutation $R_x$ applied to the arguments of any factorized function $\phi$ being equivalent to the reciprocal permutation applied to the
indices. We thus get

$$H^\psi_{PQ} = \int P_\psi R_\psi^{-1}\phi^*\,H\,Q_\psi R_\psi^{-1}\phi\,dV = H^\psi_{PR^{-1},\,QR^{-1}}. \tag{341}$$
With $R = Q$ this reduces to
$$H^\psi_{PQ} = H^\psi_{PQ^{-1}}, \tag{341 a}$$


where $H^\psi_P$ is an abbreviation for $H^\psi_{P,I}$. The difference between this
result and the expression (338a) for the matrix element of $H$ with
respect to the original functions $P_x\phi$ and $Q_x\phi$ consists only in the order
in which the two permutations $P$ and $Q^{-1}$ must be multiplied by each
other. We shall presently see that thanks to this difference it is
possible to reduce our perturbation problem to a simpler form,
corresponding to the replacement of the energy operator $H$ by the
equivalent ‘permutation operator’

$$W = \sum_R H^\psi_R R_\psi. \tag{342}$$


The fact that the two operators are equivalent so far as the first
approximation equations (336) are concerned is proved by comparing
the matrix elements of $W$ and $H$ with respect to the functions $P_\psi\phi$.
We have, namely,

$$W^\psi_{PQ} = \sum_R H^\psi_R\int P_\psi\phi^*\,R_\psi Q_\psi\phi\,dV,$$

which in view of the orthogonality and normalizing conditions for the
functions $P_\psi\phi$ reduces to

$$W^\psi_{PQ} = H^\psi_R \qquad (RQ = P),$$

that is, to $H^\psi_{PQ}$ according to (341).

A similar result cannot be obtained with the wave functions $P_x\phi$ which
have been used before, for with $W$ defined by the formula $W = \sum_R H_R R_x$
we get $W_{PQ} = H_{PQ^{-1}}$. There can, however, be no correspondence
between this expression and the matrix element $H_{PQ} = H_{Q^{-1}P}$, for
the two permutations $PQ^{-1}$ and $Q^{-1}P$ are in general quite different.

The form of the energy operator H has been left hitherto quite 
arbitrary (apart from its symmetry with respect to all the electrons). 
Now in all actual problems H can be written down in the form 


$$H = \sum_i E(x_i,p_i) + \sum_{i<k} F(x_i,x_k), \tag{343}$$

where the first term represents the sum of the energies of the separate
electrons, supposed to move independently, while the second term is
equal to their interaction energy, so that $F(x_i,x_k) = e^2/r_{ik}$ ($r_{ik}$ being
the distance between the $i$th and the $k$th electrons). It should
be emphasized that in writing down the expression (343) we must not
consider the energy $E(x_i,p_i)$ as corresponding to the approximate description of the motion by means of the individual wave function
$\psi_i(x_i)$. The latter can correspond to a somewhat different energy
operator $E_i(x_i,p_i)$ involving some additional terms which serve to
account in a simplified way for the mutual action of the $i$th electron
with the rest, by an adequately chosen value of the ‘screening constant’ in the case of a complex atom, or by some type of ‘self-consistent’ field. The difference

$$S = H - \sum_i E_i(x_i,p_i) \tag{343 a}$$


can be defined as the perturbation energy. In order to obtain by our 
perturbation method a good approximation to the truth we must 
adequately determine the ‘effective’ energy operators $E_i$ for the individual electrons in such a way that the matrix elements of the perturbation energy $S$ should be as small as possible. We shall come back
to this question in § 43. We are interested here only in the specialization 
of our general theory for the actual case of an energy operator of the 
form (343). 

We shall assume for the sake of simplicity the functions $\psi_i$ and
consequently $P\phi$ to be mutually orthogonal (and of course normalized
to 1). The matrix element $E_R$ of the energy $E(x_i,p_i)$ defined by the
general formula

$$E_R = \int R\phi^*\,E(x_i,p_i)\,\phi\,dX \qquad (dX = dx_1\dots dx_n)$$

is then easily seen to vanish for all the permutations $R$ except the
identical one, in which case it reduces to

$$E_i = \int \psi_i^*(x_i)\,E(x_i,p_i)\,\psi_i(x_i)\,dx_i, \tag{344}$$

that is, to the average value of the energy of the $i$th electron with
regard to the external field alone for the state of motion which was
initially assigned to it. It should be kept in mind that this motion,
inasmuch as it is described by the approximate energy operator
$E_i(x_i,p_i)$ which contains some additional external field more or less
equivalent to the mutual action of the $i$th electron with the rest, differs
from the motion described by the operator $E(x_i,p_i)$, and that accordingly the energy $E_i$ is in general different from the characteristic value
$E'_i$ of the energy corresponding to the wave function $\psi_i$.
Taking the matrix element $F_R$ of the interaction energy $F(x_i,x_k)$,

$$F_R = \int R\phi^*\,F(x_i,x_k)\,\phi\,dV,$$

we easily see that it differs from zero in two cases only, namely, in the
case of the identical permutation, when it reduces to

$$F_{ik} = \iint \psi_i^*(x_i)\psi_k^*(x_k)\,F(x_i,x_k)\,\psi_i(x_i)\psi_k(x_k)\,dx_i\,dx_k, \tag{344 a}$$

and in that of a transposition $R = T_{ik}$ involving the interchange between the $i$th and $k$th electrons. We shall denote its value for this case
by $G_{ik}$, where

$$G_{ik} = \iint \psi_i^*(x_k)\psi_k^*(x_i)\,F(x_i,x_k)\,\psi_i(x_i)\psi_k(x_k)\,dx_i\,dx_k. \tag{344 b}$$

All the other matrix elements of $E$ and $F$, and consequently all the
coefficients $H_R$ for such permutations $R$ which are different from the
identical permutation or from a transposition, vanish.

It should be noted that we obtain the same expressions for the matrix
elements of $E$ and $F$ with respect to the wave functions $R_\psi\phi$. The
identification of the integration variables in (344a) and (344b) with
the coordinates of the $i$th and the $k$th electrons is irrelevant for the
value of $F_{ik}$ and $G_{ik}$, this value being determined by the states to which
the two electrons are referred, and not by the individuality of these
electrons. We could therefore write

$$F_{ik} = F_{ki} = \iint \psi_i^*(x')\psi_k^*(x'')\,F(x',x'')\,\psi_i(x')\psi_k(x'')\,dx'\,dx''$$
and
$$G_{ik} = G_{ki} = \iint \psi_i^*(x'')\psi_k^*(x')\,F(x',x'')\,\psi_i(x')\psi_k(x'')\,dx'\,dx'',$$
leaving the indices of the two electrons unspecified.

The permutation operator $W$ is thus reduced in all actual problems
to the relatively simple form
$$W = W^0 + \sum_{i<k} G_{ik}T_{ik}, \tag{345}$$
where
$$W^0 = \sum_i E_i + \sum_{i<k} F_{ik} \tag{345 a}$$
can be defined as the approximate value of the energy of the system
under consideration, the second term in (345) representing the operator
of the ‘exchange’ energy.


42. Introduction of the Spin Coordinates and Solution of the Perturbation Problem with Antisymmetrical Wave Functions

The results of the preceding section cannot be directly applied to the 
general problem of the motion of a system of electrons, for this implies 
the introduction of the spin coordinates which have been ignored 
hitherto. Even if we neglect the spin forces—which we shall always do 
in the sequel—we must take into account the spin coordinates and the 
spin quantum numbers in order to set up the antisymmetrical wave 
functions which describe a system of electrons. 

We shall consider here the problem of the approximate determination 
of the antisymmetrical wave functions with spin, which belong to a 
spinless energy operator $H$, with the help of the individual wave functions $\psi_i(x,\xi)$ describing the motion of the separate electrons in a given
external field ($\xi$ denotes the additional spin coordinate and the index
$i$ is supposed to contain the spin quantum number).

This problem admits at first sight a simple and unique solution
expressed by the determinant

$$\Phi = \frac{1}{\sqrt{n!}}\begin{vmatrix}\psi_1(x_1,\xi_1) & \cdots & \psi_1(x_n,\xi_n)\\ \vdots & & \vdots\\ \psi_n(x_1,\xi_1) & \cdots & \psi_n(x_n,\xi_n)\end{vmatrix}, \tag{346}$$

since no other wave functions but the antisymmetrical one need be
taken into account in connexion with the exchange phenomenon.

The simplification with regard to the exchange degeneracy introduced by the antisymmetry condition is, however, balanced by the additional degeneracy, due to the possibility of assigning to each electron two
different spin-states connected with the same type of orbital motion and
corresponding to the same value of the energy. We thus get for the
whole system of $n$ electrons, distributed between $n$ ‘orbits’, i.e. spinless
states, which can be specified by certain functions of the geometrical
coordinates alone $\psi_1(x), \psi_2(x),\dots, \psi_n(x)$, a degenerate set of $2^n$ states
differing from each other by the spin quantum numbers $m_1, m_2,\dots, m_n$
associated with each spinless state.

The individual states with spin can be described by the functions

$$\psi_i(x,\xi) = \psi_i(x)\,\delta_{m_i\xi}, \tag{346 a}$$
where $m_i$ and $\xi$ assume the values $\tfrac12$ and $-\tfrac12$, $\delta_{m\xi}$ being equal to 1 for
$\xi = m$, and to 0 for $\xi \neq m$ (it should be remembered that $m$ denotes
the characteristic value of the component of the spin-matrix $\sigma$ along
some fixed axis).

The spinless functions $\psi_i(x)$ need not be all different; they can occur
in pairs, under the condition that the associated spin quantum numbers
$m_i$ are different. Instead of four degenerate states we get for each such
pair only two, so that the total number of degenerate states of the
whole system is equal to $2^{n'+n''} = g$, where $n'$ is the number of singly
occupied spinless states and $n''$ the number of doubly occupied spinless
states ($n = n'+2n''$).

In the absence of any other degeneracy except the spin one [and the
exchange degeneracy which is taken care of by using as zero approximation the antisymmetrical function (346)], the problem of determining
to the first approximation the wave functions with spin $\chi(x,\xi)$ corresponding to the spinless energy operator $H$ can be solved by defining
these functions as linear combinations of $g$ functions of the type (346),

$$\chi(x,\xi) = \sum_{\alpha=1}^{g} C_\alpha\Phi_\alpha, \tag{347}$$
where the coefficients $C_\alpha$ satisfy the system of $g$ equations,
$$\sum_\beta (H_{\alpha\beta} - H'J_{\alpha\beta})\,C_\beta = 0 \qquad (\alpha = 1, 2,\dots, g), \tag{347 a}$$
under the compatibility condition
$$|H_{\alpha\beta} - H'J_{\alpha\beta}| = 0, \tag{347 b}$$
which serves for the determination of the energy-levels $H'$.
The matrix elements $H_{\alpha\beta}$ and $J_{\alpha\beta}$ must be defined here by the
expressions
$$H_{\alpha\beta} = \sum_\xi\int\Phi_\alpha^*\,H\,\Phi_\beta\,dX, \qquad J_{\alpha\beta} = \sum_\xi\int\Phi_\alpha^*\,\Phi_\beta\,dX, \tag{348}$$
where $\sum_\xi$ denotes a summation over the spin coordinates of all the
$n$ electrons involved in the functions $\Phi$.

Taking into account the relation

$$\sum_{\xi'}\delta_{m\xi'}\,\delta_{m'\xi'} = \delta_{mm'}, \tag{348 a}$$

which follows from the definition of the symbols $\delta$ (where $\xi'$ refers
to one particular electron), we can easily find that the matrix elements
(348) can be different from zero only if the functions $\Phi_\alpha$ and $\Phi_\beta$ are
associated with the same value of the resulting spin component

$$m = \sum_i m_i. \tag{348 b}$$


In fact, $H_{\alpha\beta}$ and $J_{\alpha\beta}$ can be expressed as a sum of terms each involving
a product of $n$ factors of the type (348a). Now unless the two states
$\alpha$ and $\beta$ are associated with an equal number of spins pointing in the
same direction, i.e. specified by spin quantum numbers $m_i$ having the
same value ($\tfrac12$ or $-\tfrac12$), one at least of these $n$ factors will vanish in
each such term.

We thus see that the functions $\Phi_\alpha$ can be divided into a number of
non-combining groups belonging to different characteristic values of the
total spin component $m$ of all the electrons along a certain axis, $z$ say.
This result is a direct corollary of the fact that the spinless energy
operator $H$ commutes with each of the spin matrices $\sigma_{zi}$ and consequently
with their sum

$$\sigma_z = \sum_{i=1}^{n}\sigma_{zi}.$$

Now this means that the matrix of $H$ is diagonal with respect to $m$.
We have in fact (leaving other variables out of account)

$$(H\sigma_z - \sigma_z H)_{m'm''} = \sum_m\left(H_{m'm}\,\sigma_{z\,mm''} - \sigma_{z\,m'm}\,H_{mm''}\right) = H_{m'm''}(m''-m') = 0,$$

whence it follows that $H_{m'm''} = 0$ unless $m' = m''$.

The subdivision of the functions $\Phi$ into groups belonging to the same
value of $m$ greatly simplifies the perturbation problem under consideration, for the $g$ equations (347a) are split up hereby into a number of
separate systems, containing coefficients which refer to functions $\Phi$ of
the same group only. The function $\chi(x,\xi)$ stabilized for the perturbation will belong accordingly to a definite characteristic value $m$ of $\sigma_z$
specifying the corresponding group. The equations (347), (347a), and
(347b) will be understood in the sequel to refer to one particular group
of $g$ states with the same value of $m$.



If all the spinless states $\psi_1,\dots,\psi_n$ are different, the number $g$ is given
by the formula [$C^n_r$ is the usual binomial coefficient $n!/\{r!(n-r)!\}$]

$$g(m) = C^n_{\frac12 n+m}. \tag{349}$$
In fact the number of ways in which $n_+$ positive and $n_-$ negative spins
can be associated with the $n$ different orbits is obviously equal to
$C^n_{n_+} = C^n_{n_-}$, which reduces to (349) since

$$m = \tfrac12(n_+-n_-), \tag{349 a}$$
that is,
$$n_+ = \tfrac12 n+m. \tag{349 b}$$
The sum $\sum g(m)$ taken for all values of $m$ from $-\tfrac12 n$ to $\tfrac12 n$ is equal to
$\sum_{n_+=0}^{n} C^n_{n_+} = 2^n$, as of course it should be.
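
The count (349) is easily confirmed by enumerating all the $2^n$ assignments of spins to $n$ different orbits. The following sketch does this for $n = 5$ (an arbitrary choice); the function name `g` mirrors the notation of the text, the rest is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def g(n, m):
    """Formula (349): number of spin assignments with total component m."""
    return comb(n, int(Fraction(n, 2) + m))


n = 5
half = Fraction(1, 2)
# Brute force: every orbit gets spin +1/2 or -1/2; tally the total component m.
counts = Counter(sum(spins) for spins in product((half, -half), repeat=n))

for m in sorted(counts):
    print(m, counts[m], g(n, m))   # enumerated and predicted counts side by side
print(sum(counts.values()) == 2 ** n)  # True
```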

The $g(m)$ functions $\Phi_\alpha$ forming a certain group can be obtained from
one of them, $\Phi$, by permuting the spin quantum numbers $m_1, m_2,\dots, m_n$
associated with the separate orbits between the latter, with the condition that identical orbits, if present, should always be associated
with opposite spins. Such permutations $P$ must be distinguished from
those which we have considered before and which referred either to the
distribution of the electrons between the (spinless) states or of the states
between the electrons.

Just as before, however, it can be concluded from this circumstance
that the number of different matrix elements $H_{\alpha\beta}$ and $J_{\alpha\beta}$ is reduced
from $g^2$ to $g$. We shall not stop to investigate this question, for, as
has been shown by Slater, all we need to know are the diagonal elements
of the energy, from which the perturbed energy-levels can easily be
computed without directly solving the perturbation equations (347a).

The diagonal elements of $H$ are easily seen to have the same value,
$H(m)$ say, for all the $g(m)$ functions $\Phi$. If the individual wave functions
(with spin) $\psi_1,\dots,\psi_n$ are orthogonal and normalized to 1, i.e. if $J_{\alpha\beta} = \delta_{\alpha\beta}$
(which we shall assume to be the case), then according to (347b) the
sum of the diagonal elements of $H$, that is, the product $H(m)g(m)$,
must be equal to the sum of the $g(m)$ characteristic values of $H$
belonging to $m$ which are the roots of equation (347b). Now whereas
$m$, being the characteristic value of the projection $\sigma_z$ of the resulting
spin $\sigma$ on the direction of the arbitrarily chosen $z$-axis, depends upon
the choice of its direction in space, the characteristic values of the
energy must obviously be independent of the choice of this direction,
being in fact invariant with respect to the rotations of the coordinate
axes. They must be determined therefore by the characteristic values
$s$ of the resulting spin itself, which are also invariant both with respect
to rotations of the coordinate axes and to the permutations of the
electrons.

So long as the forces due to the spin of the electrons (including the 
effects of their orientation in an external magnetic field) are neglected, 
all those states which belong to the same value of the resulting spin 
form a degenerate set, so that their energy is wholly determined by s. 
The number of such states f(s) and their energy H(s) can easily be 
calculated from g(m) and H(m) if we take into account the fact that, 
for a given m, s can assume the following values: 


$$s = |m|,\ |m|+1,\ \dots,\ \tfrac12 n.$$
Subdividing all the states belonging to a definite $m$ into groups specified
by different values of $s$, we thus get

$$g(m) = \sum_{s=|m|}^{\frac12 n} f(s) \tag{350}$$
and
$$g(m)H(m) = \sum_{s=|m|}^{\frac12 n} f(s)H(s). \tag{350 a}$$
The latter equation can be rewritten in the form

$$H(m) = \frac{\sum_{s=|m|}^{\frac12 n} f(s)H(s)}{\sum_{s=|m|}^{\frac12 n} f(s)}, \tag{350 b}$$
which expresses the fact that the diagonal elements of the energy $H$
are equal to the average value of the energy for all the states
associated with the corresponding value of $m$.

From (350) and (350a) we obtain

$$-f(s) = g(s+1)-g(s) = \Delta g(s) \tag{351}$$
and
$$-f(s)H(s) = g(s+1)H(s+1)-g(s)H(s) = \Delta[g(s)H(s)], \tag{351 a}$$
whence
$$H(s) = \frac{\Delta[g(s)H(s)]}{\Delta g(s)}, \tag{351 b}$$
where on the right-hand sides $g(s)$ and $H(s)$ stand for $g(m)$ and $H(m)$ taken at $m = s$.
Since $g(s)$ is known, being determined by the equation (349) in the case
of $n$ different orbits, our problem reduces to the calculation of a diagonal
element of $H$ for a given value of $m$ ($= s$).
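
The numbers $f(s)$ defined by (351) can be tabulated at once from (349). The sketch below (with illustrative names) checks that they reproduce (350) for every $m$ and that, each value of $s$ being counted with its $2s+1$ orientations, the total number of states is again $2^n$.

```python
from fractions import Fraction
from math import comb


def g(n, m):
    """Formula (349); zero when n/2 + m is not an integer between 0 and n."""
    k = Fraction(n, 2) + m
    if k.denominator != 1 or not 0 <= k <= n:
        return 0
    return comb(n, int(k))


def f(n, s):
    """Formula (351): f(s) = g(s) - g(s + 1)."""
    return g(n, s) - g(n, s + 1)


def spin_values(n, m=0):
    """s = |m|, |m| + 1, ..., n/2 (starting from 0 or 1/2 when m is omitted)."""
    s = max(abs(Fraction(m)), Fraction(n % 2, 2))
    while s <= Fraction(n, 2):
        yield s
        s += 1


n = 4
print([(str(s), f(n, s)) for s in spin_values(n)])  # [('0', 2), ('1', 3), ('2', 1)]
print(sum((2 * s + 1) * f(n, s) for s in spin_values(n)) == 2 ** n)  # True
```

For four electrons in four different orbits we thus find two singlets, three triplets and one quintet, that is, $2\cdot 1 + 3\cdot 3 + 1\cdot 5 = 16$ states.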

We shall take for the operator H the expression (343), i.e. 


$$H = \sum_i E(x_i,p_i) + \sum_{i<k} F(x_i,x_k),$$

which is the only one occurring in practice.
We shall further write one of the functions $\Phi$ defined by the determinant (346) in the form

$$\Phi = \frac{1}{\sqrt{n!}}\sum_P \epsilon_P\,P_x\phi(x)\,P_\xi\delta(\xi), \tag{352}$$

where $\phi(x) = \psi_1(x_1)\psi_2(x_2)\dots\psi_n(x_n)$ is the product of the spinless functions and $\delta(\xi) = \delta_{m_1\xi_1}\delta_{m_2\xi_2}\dots\delta_{m_n\xi_n}$ the product of the corresponding
spin factors, $\epsilon_P$ being equal to $+1$ or $-1$ for permutations of the even
and odd type respectively (the permutations $P_x$ refer to the geometrical
coordinates and $P_\xi$ to the spin coordinates of the electrons).

Let us consider the case when all the $n$ orbits $\psi_1, \psi_2,\dots,\psi_n$ are
different (and orthogonal to each other). The expression

$$H = \sum_\xi\int\Phi^*H\Phi\,dX = \frac{1}{n!}\sum_P\sum_Q \epsilon_P\epsilon_Q\int P_x\phi^*\,H\,Q_x\phi\,dX\ \sum_\xi P_\xi\delta(\xi)\,Q_\xi\delta(\xi), \tag{352 a}$$

which defines the diagonal matrix element of $H$ with respect to the
state $\Phi$ (or the corresponding average value) is then easily simplified.
The integral $H_{PQ} = \int P_x\phi^*\,H\,Q_x\phi\,dX$ differs from zero, as we know,
only when the permutations $P$ and $Q$ are identical ($P = Q$) or when
they differ by a transposition $T_{ik}$ of any two electrons ($Q = PT_{ik}$). It
reduces in the first case to $H_I = W^0 = \sum_i E_i + \sum_{i<k} F_{ik}$ and in the
second to $G_{ik}$ [cf. equations (344), (344a), (344b), and (345a) of the
preceding section].
We have further, when $P = Q$,

$$\frac{1}{n!}\sum_P\sum_\xi P_\xi\delta(\xi)\,P_\xi\delta(\xi) = 1,$$

since the total number of different permutations is just equal to $n!$.
A little more care is required for the calculation of the preceding
expressions when $Q = PT_{ik}$. It is clear that the function

$$\delta = \delta_{m_1\xi_1}\delta_{m_2\xi_2}\dots\delta_{m_n\xi_n}$$

remains unaltered if the same permutation $R$ is applied both to the
spin coordinates $\xi_i$ and to the spin quantum numbers $m_i$ (or more
exactly, to the indices of these variables). Any permutation $P_\xi$ of the
former can therefore be replaced by the reciprocal permutation $P_m^{-1}$ of
the latter. We thus have

$$\sum_\xi P_\xi\delta\cdot P_\xi T_{ik}\delta = \sum_\xi P_m^{-1}\delta\cdot P_m^{-1}T_{ik}\delta,$$

where $T_{ik}$ denotes, as before, the interchange of the coordinates $\xi_i$ and
$\xi_k$, which in the original distribution were assigned to the $i$th and $k$th
electrons.


Now in the function $P_m^{-1}\delta$ these coordinates will be associated with
the spin quantum numbers $P_m^{-1}m_i = m_{i'}$ and $P_m^{-1}m_k = m_{k'}$, where $i'$ and $k'$
are the numbers derived from $i$ and $k$ by the permutation $P^{-1}$. In the
function $P_m^{-1}T_{ik}\delta$ the same coordinates will be associated with the spin
quantum numbers $m_{k'}$ and $m_{i'}$ respectively. The sum $\sum_\xi P_m^{-1}\delta\cdot P_m^{-1}T_{ik}\delta$
will obviously be equal to 1 if these two numbers are equal ($+\tfrac12$ or $-\tfrac12$)
and to 0 if they are different.

Let us suppose that the numbers $m_1, m_2,\dots, m_n$ are labelled in such
a way that the first $n_+$ of them are equal to $+\tfrac12$ and the last $n_-$ to $-\tfrac12$
($n_++n_- = n$). If now all the permutations $P_m^{-1}$ are applied to their
indices, then each index will have an equal chance of being found at any
place of the line, under the condition that two originally different
indices will always have different places.

The number of positions which any two indices corresponding
originally to $i$ and $k$ can assume in the row of the $n_+$ positive spins is
obviously equal to $n_+(n_+-1)$, and in that of the $n_-$ negative spins to
$n_-(n_--1)$. The sum of these two numbers multiplied by $(n-2)!$ will
give the total number of distributions (i.e. permutations $P$) for which the two spin quantum numbers are equal. We thus
see that in the case $Q = PT_{ik}$, the expression

$$\frac{1}{n!}\sum_P\sum_\xi P_\xi\delta\cdot Q_\xi\delta = \frac{1}{n!}\sum_P\sum_\xi P_m^{-1}\delta\cdot P_m^{-1}T_{ik}\delta$$

is equal, irrespective of the choice of $i$ and $k$, to
$$\frac{n_+(n_+-1)+n_-(n_--1)}{n(n-1)}.$$
The expression (352a) for the average value of the energy assumes
accordingly the following form

$$H = \sum_i E_i + \sum_{i<k} F_{ik} - \frac{n_+(n_+-1)+n_-(n_--1)}{n(n-1)}\sum_{i<k} G_{ik}, \tag{353}$$

where the negative sign corresponds to the fact that $\epsilon_P\epsilon_Q = -1$ for
two permutations differing from each other by a transposition (one of
them being of the even and the other of the odd type). Writing $W^0$
for the sum of the first two terms and putting $m = \tfrac12(n_+-n_-)$, i.e.
$n_+ = \tfrac12 n+m$, $n_- = \tfrac12 n-m$, we can represent $H$ as a function of $m$
explicitly by the formula
$$H(m) = W^0 - \frac{n^2+4m^2-2n}{2n(n-1)}\sum_{i<k} G_{ik}. \tag{353 a}$$


As would be expected, this expression is a function of $m$ alone, and is
independent of the choice of $\Phi$ out of the group belonging to a given
$m$, i.e. is the same for all diagonal elements of the energy matrix. We
can now pass on to the calculation of the characteristic values of the
energy as functions of the resulting spin $s$.
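
The coefficient of the exchange sum in (353) rests on a counting argument which can be tested by brute force: among all $n!$ permutations of the spin quantum numbers we count those which bring equal spins to two given places. The sketch below (names illustrative; places are numbered from 0) compares this count with the expression in $n_+$, $n_-$ and with its form in $m$, namely $(n^2+4m^2-2n)/2n(n-1)$, obtained by substituting $n_\pm = \tfrac12 n \pm m$.

```python
from fractions import Fraction
from itertools import permutations


def same_spin_fraction(n_plus, n_minus, i=0, k=1):
    """Fraction of the n! permutations that bring equal spins to places i and k."""
    n = n_plus + n_minus
    m = [Fraction(1, 2)] * n_plus + [Fraction(-1, 2)] * n_minus
    perms = list(permutations(range(n)))
    hits = sum(1 for p in perms if m[p[i]] == m[p[k]])
    return Fraction(hits, len(perms))


def counting_formula(n_plus, n_minus):
    """[n+(n+ - 1) + n-(n- - 1)] / [n(n - 1)], the coefficient in (353)."""
    n = n_plus + n_minus
    return Fraction(n_plus * (n_plus - 1) + n_minus * (n_minus - 1), n * (n - 1))


def in_terms_of_m(n, m):
    """The same coefficient after substituting n+ = n/2 + m, n- = n/2 - m."""
    return Fraction(n * n + 4 * m * m - 2 * n) / (2 * n * (n - 1))


print(same_spin_fraction(3, 2), counting_formula(3, 2), in_terms_of_m(5, Fraction(1, 2)))
# 2/5 2/5 2/5
```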

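In the first place we have, according to (351) in conjunction with (349),

$$f(s) = C^n_{\frac12 n+s} - C^n_{\frac12 n+s+1} = C^n_{\frac12 n+s}\,\frac{2s+1}{\frac12 n+s+1}. \tag{354}$$
Further, according to (351a) and (353a),
$$f(s)H(s) = f(s)W^0 - \left[C^n_{\frac12 n+s}\,\frac{n^2+4s^2-2n}{2n(n-1)} - C^n_{\frac12 n+s+1}\,\frac{n^2+4(s+1)^2-2n}{2n(n-1)}\right]\sum_{i<k} G_{ik},$$
whence
$$H(s) = W^0 - \frac{n(n-4)+4s(s+1)}{2n(n-1)}\sum_{i<k} G_{ik}. \tag{354 a}$$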


This formula was originally derived by Heitler in connexion with the 
spin theory of chemical forces. The derivation given above is a 
modification of that given by Slater (in his theory of energy-levels in a 
complex atom) and by Pauli (in connexion with Heisenberg’s theory of 
ferromagnetism). 
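
The passage from the averages $H(m)$ to the energies $H(s)$ is a matter of simple algebra, which can be checked in exact arithmetic. In the sketch below, `a(n, m)` denotes the coefficient of $\sum G_{ik}$ in $H(m)$ as it follows from (353), and the difference quotient (351b) is compared with the closed form $[n(n-4)+4s(s+1)]/2n(n-1)$ to which it reduces. The function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def g(n, m):
    """Formula (349); zero when n/2 + m is not an integer between 0 and n."""
    k = Fraction(n, 2) + m
    if k.denominator != 1 or not 0 <= k <= n:
        return 0
    return comb(n, int(k))


def a(n, m):
    """Coefficient of the exchange sum in H(m) = W0 - a(n, m) * sum(G_ik)."""
    return Fraction(n * n + 4 * m * m - 2 * n) / (2 * n * (n - 1))


def coefficient_from_differences(n, s):
    """Formula (351b) applied to the exchange coefficient alone."""
    num = g(n, s + 1) * a(n, s + 1) - g(n, s) * a(n, s)
    den = g(n, s + 1) - g(n, s)
    return num / den


def closed_form(n, s):
    """[n(n - 4) + 4 s (s + 1)] / [2 n (n - 1)]."""
    return Fraction(n * (n - 4) + 4 * s * (s + 1)) / (2 * n * (n - 1))


n = 4
for s in (0, 1, 2):
    print(s, coefficient_from_differences(n, s), closed_form(n, s))
```

For two electrons the coefficient is $-1$ for $s = 0$ and $+1$ for $s = 1$, which gives the familiar pair of levels $W^0 + G_{12}$ and $W^0 - G_{12}$.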

Pauli’s method of dealing with the perturbation problem under con- 
sideration differs from that of Slater in the choice of the original wave 
functions with spin. Instead of taking the antisymmetrical functions 
defined by the determinant (346) we can use as the zero approximation, 
just as in the spinless case, the factorized functions obtained by multiplying by each other the individual functions $\psi_i(x_i)\delta_{m_i\xi_i}$.

We shall slightly modify our previous notation by introducing the
letters $J_1, J_2,\dots, J_n$ to specify the different spinless orbits with which
the separate electrons are associated and by writing $(J_i|x_i)$ instead of
$\psi_i(x_i)$, and $(m_i|\xi_i)$ instead of $\delta_{m_i\xi_i}$. The factorized functions with which
we must start can be obtained from one of them

$$\phi(x)\delta(\xi) = (J_1|x_1)\dots(J_n|x_n)(m_1|\xi_1)\dots(m_n|\xi_n) = (J|x)(m|\xi) \tag{355}$$


by permuting the different electrons, i.e. by applying the same permutations $P$ to the arguments $x$ and $\xi$, and also taking the two possible
values for each of the spin quantum numbers $m_i$. Now, as has been
shown before, only those functions (355) must be combined with each
other which correspond to the same value of the sum $\sum m_i = m$ and
which accordingly can be obtained from each other by applying various
permutations $R$ (independent of $P$) to the indices $m_i$. The set of
degenerate states which must be taken into account for the construction
of the wave function $\chi(x,\xi)$ stabilized for the perturbation can thus be
specified by the expression

$$(J|Px)(Rm|P\xi), \tag{355 a}$$
where $P$ and $R$ are arbitrary permutations. Since a permutation of
the arguments $(x,\xi)$ is equivalent to the reciprocal permutation of the
indices $(J, m)$, we can replace the preceding expression by

$$(P^{-1}J|x)(P^{-1}Rm|\xi)$$
or by
$$\phi_{P,Q} = (PJ|x)(Qm|\xi), \tag{355 b}$$
$P$ and $Q$ being independent of each other.
The $n!$ different permutations $Q$ actually lead to

$$g(m) = C^n_{n_+} = C^n_{n_-} = C^n_{\frac12 n-m}$$

different spin factors $(Qm|\xi) = (m|Q^{-1}\xi)$ which are distinguished from
each other by the coordinates $\xi_1,\dots,\xi_n$ associated with the values $m_i = \tfrac12$
and $m_i = -\tfrac12$ respectively. In what follows we shall assume the permutations $Q$ to be subdivided into $g(m)$ classes, corresponding to the
different functions $(Qm|\xi)$, and shall take for $Q$ only one representative
of each class, treating all the permutations of each class as identical.
The function x(z, €) can now be defined by the formula 


x(x, €) = > 2 Crore (356) 
where the coefficients Cp, are determined by the equations 
p py (Hp; pg — AI po; pe) Cr,q = 0 (356 a) 
with 
Hpg; pe = 2 { $7.0 Hdbp.g dV 
= F (Omg (Q'mie) | (| PI)H(P'I |x) aV 
and ° 


Trova =f drodr.o dV =F (mime) f (@\PI\P'I|2) av. 
So long as we are considering effectively different permutations Q and 
Q’ only we can assume the sums > (Qm|£)(Q’m|£) to vanish except for 


the case Q = Q’ when they are equal to 1. The non-vanishing matrix 
elements of H and J thus reduce to 


Apo; pg = Hp = Hpp-, Jpa.p.g =I PP’ = JIpp-, 
where HE ie and Ji are the usual matrix elements of H and J = 


§ 42 INTRODUCTION OF SPIN COORDINATES 419 


with regard to the spinless functions (PJ |x) and (P’J|z). The equations 
(356 a) can therefore be rewritten in the form 
p> (Hpp-i—H’J pp)Cp.9 = 9, 


or, if we put PP’-! = R, 


We can now make use of the fact that the only functions (356) we
need are the antisymmetrical ones. This means that $\chi(Sx,S\xi) = \epsilon_S\,\chi(x,\xi)$,
where $\epsilon_S = 1$ for a permutation $S$ of even type and $-1$ for one of
odd type. Since the application of a permutation $S$ or $S^{-1}$ to the
arguments $x$, $\xi$ of the functions (355b) is equivalent to the application
of the reciprocal permutation to the indices $J$, $m$, we get

$$\sum_P\sum_Q C_{P,Q}\,\phi_{S^{-1}P,\,S^{-1}Q} = \epsilon_S\sum_P\sum_Q C_{P,Q}\,\phi_{P,Q},$$
or, replacing $S^{-1}P$ and $S^{-1}Q$ in the first sum by $P'$ and $Q'$,

$$\sum_{P'}\sum_{Q'} C_{SP',SQ'}\,\phi_{P',Q'} = \epsilon_S\sum_P\sum_Q C_{P,Q}\,\phi_{P,Q} = \epsilon_S\sum_{P'}\sum_{Q'} C_{P',Q'}\,\phi_{P',Q'},$$
whence it follows that
$$C_{SP,SQ} = \epsilon_S\,C_{P,Q}. \tag{357}$$
This gives, if $SQ$ is replaced by $Q$ (i.e. $Q$ by $S^{-1}Q$) and $S$ by $R^{-1}$,

$$C_{R^{-1}P,Q} = \epsilon_R\,C_{P,RQ},$$
so that the equations (356b) can be rewritten in the form
$$\sum_R \epsilon_R\left(H_R - H'J_R\right)C_{P,RQ} = 0. \tag{357 a}$$

The index $P$ is irrelevant, as it is the same for the whole system of
equations and can therefore be left out of account. So far as the
coefficients $C_{P,RQ} = C_{RQ}$ are concerned the summation over $R$ can lead
to $g(m)$ different values only, which will be multiplied in equations
(357a) by the sum of the expressions $\epsilon_R(H_R - H'J_R)$ for all the $R$’s which
correspond to equivalent permutations $RQ$.

Putting as before $H = \sum_i E(x_i,p_i) + \sum_{i<k} F(x_i,x_k)$ and assuming the
functions $(PJ|x)$ to be mutually orthogonal, we get
$$(H_I - H')\,C_Q - \sum_{i<k} G_{ik}\,C_{T_{ik}Q} = 0, \tag{357 b}$$
where $I$ denotes the identical permutation, so that
$$H_I = W^0 = \sum_i E_i + \sum_{i<k} F_{ik},$$
while $T_{ik}$ corresponds to an interchange between the spin quantum
numbers $m_i$ and $m_k$.
The $g(m)$ different coefficients $C_Q$ can be specified unambiguously by
the indices of the $n_+$ electrons with a positive component of their spin
along the $z$-axis. We can thus write (following Pauli)
$$C_Q = C(r_1, r_2,\dots, r_{n_+}),$$
where $r_1, r_2,\dots, r_{n_+}$ are the indices in question, $C$ being independent of
the order in which they appear. We can put in particular $r_1 = 1$,
$r_2 = 2,\dots, r_{n_+} = n_+$, without affecting the generality of our theory, since
the choice of the permutation $Q$ in the equations (357b) is irrelevant
for their solution. Putting accordingly

$$C_{T_{ik}Q} = T_{ik}C_Q = T_{ik}C(r_1, r_2,\dots, r_{n_+}) = C(T_{ik}r_1, T_{ik}r_2,\dots, T_{ik}r_{n_+}),$$
we can rewrite the equation (357b) in the form
$$(H_I - H')\,C(r_1, r_2,\dots, r_{n_+}) - \sum_{i<k} G_{ik}\,T_{ik}C(r_1, r_2,\dots, r_{n_+}) = 0. \tag{357 c}$$


If we consider the determinant of these equations, whose roots give the allowed values of the energy $H'$, we see at once that the sum of these values for all the $g(m)$ perturbed states is equal to the sum of the coefficients of $C(r_1, r_2, \ldots, r_{n_+})$ (without of course the term $H'$), that is, to the expression

$$\sum_r \Bigl(H_I - {\sum\sum_{i<k}}' G_{ik}\Bigr).$$

The summation $\sum'$ is extended over those pairs of states (or electrons) which interchange either two of the indices $r_1, r_2, \ldots, r_{n_+}$ or two of the remaining indices, $s_1, s_2, \ldots, s_{n_-}$ say (corresponding to negative spins), without interchanging any $r$ with any $s$,† whereas the summation $\sum_r$ is extended over the $g(m)$ different combinations of the $r$'s. As a result we obtain each $G_{ik}$ multiplied by the number of combinations for which the spins associated with the states $i$ and $k$ (or the $i$th and the $k$th electrons) are both positive or both negative, i.e.

$$C^{\,n_+-2}_{\,n-2} + C^{\,n_--2}_{\,n-2} = C^{\,n_+}_{\,n}\,\frac{n_+(n_+-1) + n_-(n_--1)}{n(n-1)},$$

$H_I$ being multiplied by $g(m) = C^{\,n_+}_{\,n}$. We thus get for the average value of the energy $H'$ of the $g(m)$ perturbed states the expression

$$\overline{H'}(m) = H_I - \frac{n_+(n_+-1) + n_-(n_--1)}{n(n-1)} \sum\sum_{i<k} G_{ik},$$

which has been obtained before.
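The counting step above can be checked by brute force. The following is only an illustrative sketch: the function names are hypothetical, and the electrons are simply labelled 0, 1, ..., n-1. It enumerates every way of choosing the $n_+$ electrons with positive spin and counts those in which a fixed pair carries equal spins, then compares the count with the closed form quoted in the text.

```python
from itertools import combinations
from math import comb


def same_spin_count(n, n_plus, i=0, k=1):
    """Count the choices of n_plus 'spin up' electrons among n in which
    electrons i and k have the same spin (both up or both down)."""
    count = 0
    for ups in combinations(range(n), n_plus):
        up_set = set(ups)
        if (i in up_set) == (k in up_set):
            count += 1
    return count


def closed_form(n, n_plus):
    """C(n, n_+) [n_+(n_+ - 1) + n_-(n_- - 1)] / [n(n - 1)], in exact integers."""
    n_minus = n - n_plus
    numerator = n_plus * (n_plus - 1) + n_minus * (n_minus - 1)
    return comb(n, n_plus) * numerator // (n * (n - 1))


# Compare both counts for all small systems (assumed range, chosen for speed).
agreement = all(
    same_spin_count(n, n_plus) == closed_form(n, n_plus)
    for n in range(2, 8)
    for n_plus in range(0, n + 1)
)
```

Dividing the count by `comb(n, n_plus)` gives the weight with which each exchange integral enters the average energy.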
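† i.e. over those indices $i$, $k$ which both occur either among the $n_+$ indices $r$ or among the $n_-$ indices $s$.

As has been shown by Dirac, the transpositions $T_{ik}$ occurring in (357 c) can be replaced by operators involving Pauli's spin matrices $\boldsymbol{\sigma}_i$ and $\boldsymbol{\sigma}_k$. Let us consider the scalar product of these spin vectors, that is, the operator $\boldsymbol{\sigma}_i\cdot\boldsymbol{\sigma}_k$, applied to some function of $\boldsymbol{\sigma}_i$ and $\boldsymbol{\sigma}_k$, and in the first place to $\boldsymbol{\sigma}_i$ and $\boldsymbol{\sigma}_k$ themselves or their components along some axis, $z$ say. We have, putting $i = 1$ and $k = 2$,

$$(\boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2)\,\sigma_{1z} = (\sigma_{1x}\sigma_{2x} + \sigma_{1y}\sigma_{2y} + \sigma_{1z}\sigma_{2z})\,\sigma_{1z} = (\sigma_{1x}\sigma_{1z})\,\sigma_{2x} + (\sigma_{1y}\sigma_{1z})\,\sigma_{2y} + (\sigma_{1z}\sigma_{1z})\,\sigma_{2z},$$

since the vectors $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$ commute with each other, and further, in virtue of the relations (253), § 29,

$$(\boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2)\,\sigma_{1z} = -i\,\sigma_{1y}\sigma_{2x} + i\,\sigma_{1x}\sigma_{2y} + \sigma_{2z},$$

or

$$(1 + \boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2)\,\sigma_{1z} = i(\sigma_{1x}\sigma_{2y} - \sigma_{1y}\sigma_{2x}) + \sigma_{1z} + \sigma_{2z} = [\,i(\boldsymbol{\sigma}_1\times\boldsymbol{\sigma}_2) + \boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2\,]_z.$$

Similar expressions are obtained if $\sigma_{1z}$ is replaced by $\sigma_{1x}$ or $\sigma_{1y}$, so that

$$(1 + \boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2)\,\boldsymbol{\sigma}_1 = i\,\boldsymbol{\sigma}_1\times\boldsymbol{\sigma}_2 + \boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2.$$

We get likewise

$$(1 + \boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2)\,\boldsymbol{\sigma}_2 = i\,\boldsymbol{\sigma}_2\times\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2,$$

and

$$\boldsymbol{\sigma}_2\,(1 + \boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2) = i\,\boldsymbol{\sigma}_1\times\boldsymbol{\sigma}_2 + \boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2,$$

whence

$$(1 + \boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2)\,\boldsymbol{\sigma}_1 = \boldsymbol{\sigma}_2\,(1 + \boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2). \qquad (358)$$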
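We have on the other hand

$$(\boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2)^2 = (\sigma_{1x}\sigma_{2x} + \sigma_{1y}\sigma_{2y} + \sigma_{1z}\sigma_{2z})^2$$

$$= \sigma_{1x}^2\sigma_{2x}^2 + \sigma_{1y}^2\sigma_{2y}^2 + \sigma_{1z}^2\sigma_{2z}^2 + \sigma_{1x}\sigma_{1y}\,\sigma_{2x}\sigma_{2y} + \sigma_{1y}\sigma_{1x}\,\sigma_{2y}\sigma_{2x} + \ldots$$

$$= 3 + 2\,(i\sigma_{1z})(i\sigma_{2z}) + \ldots$$

$$= 3 - 2\,\boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2,$$

and consequently

$$(1 + \boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2)^2 = 4. \qquad (358\,\text{a})$$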
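It follows from these equations that the spin operator

$$O_{12} = \tfrac{1}{2}\,(1 + \boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2) \qquad (358\,\text{b})$$

has the same properties with respect to any function of the spin variables $\boldsymbol{\sigma}_1$, $\boldsymbol{\sigma}_2$ as the permutation operator $T_{12}$. This becomes quite clear if we rewrite the equation (358) in the form

$$O_{12}\,\boldsymbol{\sigma}_1\,O_{12}^{-1} = \boldsymbol{\sigma}_2,$$

or in the equivalent form

$$O_{12}^{-1}\,\boldsymbol{\sigma}_2\,O_{12} = \boldsymbol{\sigma}_1,$$

which reduces to

$$O_{12}\,\boldsymbol{\sigma}_2\,O_{12}^{-1} = \boldsymbol{\sigma}_1$$

in view of the equation $O_{12}^2 = 1$, which corresponds to the relation $T_{12}^2 = 1$ (= identical permutation).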
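The relations (358), (358 a) and (358 b) can be verified numerically for two spins. The sketch below is an illustration under stated assumptions rather than part of the original argument: it uses the standard 2 × 2 Pauli matrices, builds the two-spin operators as Kronecker products, and compares $O_{12}$ with an explicitly constructed transposition matrix (the variable names are ours).

```python
import numpy as np

# Standard Pauli matrices (assumed representation).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)
paulis = (sx, sy, sz)

# sigma_1 acts on the first spin, sigma_2 on the second.
s1 = [np.kron(s, one) for s in paulis]
s2 = [np.kron(one, s) for s in paulis]

# Scalar product sigma_1 . sigma_2 and the operator O_12 of (358 b).
dot = sum(a @ b for a, b in zip(s1, s2))
O12 = 0.5 * (np.eye(4) + dot)

# Transposition T_12: basis state |a b> (index 2a + b) goes to |b a>.
T12 = np.zeros((4, 4))
for a in range(2):
    for b in range(2):
        T12[2 * b + a, 2 * a + b] = 1.0

is_transposition = np.allclose(O12, T12)
squares_to_identity = np.allclose(O12 @ O12, np.eye(4))
swaps_spins = all(np.allclose(O12 @ s1[j] @ O12, s2[j]) for j in range(3))
```

Because $O_{12}^2 = 1$, the inverse in $O_{12}\boldsymbol{\sigma}_1 O_{12}^{-1}$ is replaced by $O_{12}$ itself in the last check.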

The equivalence between $O_{12}$ and $T_{12}$ is preserved with regard to the functions of the other spin variables $\boldsymbol{\sigma}_3$, $\boldsymbol{\sigma}_4$, etc., since they commute with $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$, and further with regard to any function of the type $C(r_1, r_2, \ldots, r_{n_+})$, since we can replace the indices $r_i$ of the electrons by the corresponding spin variables $\boldsymbol{\sigma}_i$ (or their squares). We can accordingly replace the permutation operators $T_{ik}$ in the system of equations (357 c) by the spin operators $O_{ik}$ (the fact that the sign need not be changed can easily be ascertained by considering a particular case). This system of equations can thus be written in the standard form of a wave equation


$$(W - H')\,C = 0, \qquad (359)$$

where

$$W = H_I - \tfrac{1}{2}\sum\sum_{i<k} G_{ik}\,(1 + \boldsymbol{\sigma}_i\cdot\boldsymbol{\sigma}_k) = \sum_i E_i + \sum\sum_{i<k}\bigl(F_{ik} - \tfrac{1}{2}G_{ik}\bigr) - \tfrac{1}{2}\sum\sum_{i<k} G_{ik}\,\boldsymbol{\sigma}_i\cdot\boldsymbol{\sigma}_k \qquad (359\,\text{a})$$

is the approximate energy operator, which is equivalent to $H$ as far as the first approximation of the perturbation theory is concerned.

This result, due to Dirac, is very important both from the practical point of view (for in many cases it enables one to calculate very easily the perturbed energy-levels) and from the theoretical point of view, for it shows that the 'exchange energy' in connexion with the antisymmetry principle can be interpreted, in a purely formal way, as due to a fictitious kind of magnetism associated with the spin. In fact the expression

$$W_{ik} = -\tfrac{1}{2}\,G_{ik}\,\boldsymbol{\sigma}_i\cdot\boldsymbol{\sigma}_k \qquad (359\,\text{b})$$

can be considered as representing the energy of a fictitious magnetic interaction between the $i$th and $k$th electrons, their actual magnetic moments being replaced by quantities of an electrostatic nature. It should be noted that only a part of the exchange energy can be interpreted in this way; another part, $-\tfrac{1}{2}\sum\sum_{i<k} G_{ik}$, goes over into the ordinary electrostatic energy $\sum\sum_{i<k} F_{ik}$.

We shall consider in Part III some important applications of the 
quasi-magnetic effects determined by (359 b) to the theory of the mag- 
netic properties of atoms and of ferromagnetic bodies. Another illustra- 
tion of equations (359 a) will be found in the theory of the chemical 
forces between two atoms, inasmuch as no other type of degeneracy 
than that due to the exchange and spin effect has to be taken into 
account. 

The above theory can easily be extended to the more general case 
when an additional degeneracy (such as that due to the different 
orientations of the electron orbits in a complex atom) must be included 
in the perturbation problem. We shall not stop here, however, to 
examine this general case. 




43. The Method of the Self-consistent Field with Factorized Wave Functions
The reduction of the problem of many electrons to that of a single electron, in the form in which it has been considered in the two preceding sections, is based on the description of the unperturbed motion of each electron in a given external field, that is, by means of an individual wave function of a given form. Now in actual problems, connected with the structure of atoms and molecules, such a field cannot be defined beforehand in a way which would ensure the degree of accuracy of the zero-order approximation which is necessary for the successful application of the perturbation theory. We must now turn to the consideration of this problem, namely, the problem of the determination of the 'equivalent external field' for the separate electrons forming a more or less complicated system (such, for example, as a complex atom).

A relatively simple method which is quite similar to that used in 
the earlier (Bohr’s) quantum theory of complex atoms, consists in the 
identification of the external field acting on a given electron with that 
of a bare nucleus (or nuclei, if there is more than one) with an electric 
charge differing from the actual one by a certain constant, which, 
divided by the elementary charge, is denoted as the ‘screening constant’ 
and is to be chosen in such a way as to represent with the highest 
possible degree of accuracy the effect of the repulsive forces acting on 
each electron due to all the rest. 

To get a more exact description of this action it is sometimes preferable to distribute the electric charge of all the electrons except that under consideration in a continuous way over some surface, or in a certain volume, with a uniform density or a density varying according to some more or less arbitrarily chosen law.

In all these cases we get a problem containing a finite number of 
constant parameters which must be adjusted in a way leading to the 
least possible error. 

This problem is solved very easily (at least in principle) with the help of the variational form of the equations of motion, namely,

$$\delta\,\frac{\int \varphi^* H \varphi\, dV}{\int \varphi^* \varphi\, dV} = 0,$$

where $\varphi(x_1, x_2, \ldots, x_n)$ is determined as the product of $n$ individual functions $\psi_1(x_1; \alpha_1, \beta_1, \ldots)$, $\psi_2(x_2; \alpha_2, \beta_2, \ldots)$, ..., $\psi_n(x_n; \alpha_n, \beta_n, \ldots)$ of known form, containing a number of undetermined parameters† $\alpha_1, \alpha_2, \ldots$, etc. [cf. § 9, Chap. II].

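Under these conditions the expression $W = \int \varphi^* H \varphi\, dV \big/ \int \varphi^* \varphi\, dV$, which is equal to the energy of the system, is defined as a certain function of the parameters $\alpha$, whose values must be determined from the equations

$$\frac{\partial W}{\partial \alpha_1} = 0, \quad \frac{\partial W}{\partial \alpha_2} = 0, \quad \ldots, \text{ etc.}$$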
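A one-parameter case shows how such an adjustment works in practice. The example below is not taken from the text: it assumes a single electron in the field of a unit point charge (the hydrogen atom), atomic units, and a Gaussian trial function $e^{-\alpha r^2}$, for which the energy is the well-known closed form $W(\alpha) = \tfrac{3}{2}\alpha - 2\sqrt{2\alpha/\pi}$. The condition $\partial W/\partial\alpha = 0$ is solved by bisection; the function names are hypothetical.

```python
import math


def energy(alpha):
    """W(alpha) for the trial function exp(-alpha r^2), hydrogen atom, atomic units."""
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)


def d_energy(alpha):
    """Derivative dW/d(alpha); it increases monotonically with alpha."""
    return 1.5 - math.sqrt(2.0 / (math.pi * alpha))


def solve_stationary(lo=1e-3, hi=10.0, tol=1e-12):
    """Find the root of dW/d(alpha) = 0 by bisection on the bracket [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_energy(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha_best = solve_stationary()
w_best = energy(alpha_best)
```

The stationary value is $\alpha = 8/(9\pi)$ with $W = -4/(3\pi)$, which lies above the exact ground-state energy $-\tfrac{1}{2}$, as it must for a trial function of restricted form.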

The equation $\delta W = 0$ can be used, however, not only to adjust the values of a finite number of parameters introduced in the more or less arbitrarily specified functions $\psi_1, \ldots, \psi_n$, but also to determine these functions themselves without the explicit introduction of any parameters (implicitly they are contained in the definition of the functions $\psi$ if the latter are supposed to be expanded in some sort of series). Now the factorized form of the wave function $\varphi$ describing the behaviour of the whole system of electrons corresponds to the possibility of assigning to each of them a separate 'orbit', i.e. a motion independent (explicitly) of that of the rest, in the sense of the wave-mechanical probability interpretation. Inasmuch as the variational principle $\delta W = 0$ ensures the highest accuracy of the results consistent with any given assumption about the character of the motion, we can thus state that the most accurate description of the motion of a system of electrons in terms of the quasi-independent motions of the separate electrons is obtained by defining the functions $\psi_1(x_1), \psi_2(x_2), \ldots, \psi_n(x_n)$, describing these individual motions, with the help of the variational equation, with

$$\varphi(x_1, \ldots, x_n) = \psi_1(x_1)\,\psi_2(x_2) \ldots \psi_n(x_n).$$
The above method has the advantage of avoiding the introduction of an arbitrary effective external field for each electron. Such a field is, however, introduced implicitly and can easily be determined in an explicit form. This is the so-called 'self-consistent field' which we have already alluded to many times, and which was applied for the first time to the problem of complex atoms by Hartree.

† We shall leave aside for the time being the complications arising from the spin effect.

In his original theory of the self-consistent field Hartree did not make any use of the variational principle (which was introduced for this purpose later on by V. Fock and J. C. Slater) but was guided by the idea that the action experienced by one electron due to the rest can be calculated approximately by distributing in space the electric charge of the latter with a density proportional to the probability of their respective positions. The contribution of each electron to the probable density of charge $\rho$ at a given point is obviously given by $\rho_k = e\,|\psi_k(x)|^2$ under the condition that all the individual functions are normalized to 1:

$$\int |\psi_k(x)|^2\, dx = 1$$

(where $dx$ is an abbreviation for the element of volume $dx\,dy\,dz$). The potential energy $U_i$ of the $i$th electron with respect to all the others can be determined accordingly by the expression

$$U_i = \sum_{k \ne i} U_{ik},$$

where

$$U_{ik} = e^2 \int \frac{|\psi_k(x')|^2}{r_{ik}}\, dx',$$

or, with a slightly different notation,

$$U_i(\mathbf{r}) = e^2 \sum_{k \ne i} \int \frac{|\psi_k(\mathbf{r}')|^2}{|\mathbf{r} - \mathbf{r}'|}\, dV'. \qquad (360)$$


Adding to this expression the potential energy $U_0(\mathbf{r})$ of the external forces (which must obviously have the same form for all the electrons) and substituting the resulting 'effective' energy

$$U(\mathbf{r}) = U_0(\mathbf{r}) + U_i(\mathbf{r}) \qquad (360\,\text{a})$$

in the Schrödinger equation

$$\Bigl[-\frac{h^2}{8\pi^2 m}\nabla^2 + U(\mathbf{r}) - W_i\Bigr]\psi_i = 0, \qquad (360\,\text{b})$$

we can determine the wave function $\psi_i$ describing the motion of the electron in question if the functions $\psi_k$ ($k \ne i$) describing that of the other electrons are supposed to be known. Now as a matter of fact they are not known beforehand, each of them being determined through the rest by an equation of the form (360 b). We obtain in this way a system of $n$ integro-differential equations which can serve for the simultaneous determination of all the $n$ individual wave functions $\psi_1, \ldots, \psi_n$.

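It may seem at first sight that the total energy $W$ of the whole system is equal to the sum of the individual energies $W_i$. This is, however, easily seen not to be the case. In fact multiplying equation (360 b) on the left by $\psi_i^*$ and integrating, we have, in view of the supposed normalization of $\psi_i$,

$$W_i = \int \psi_i^* \Bigl[-\frac{h^2}{8\pi^2 m}\nabla_i^2 + U_0 + U_i\Bigr]\psi_i\, dx_i,$$

or, according to the definition of $U_i$,

$$W_i = \int \varphi^* \Bigl[-\frac{h^2}{8\pi^2 m}\nabla_i^2 + U_{0i} + \sum_{k \ne i}\frac{e^2}{r_{ik}}\Bigr]\varphi\, dV,$$

whence it follows that

$$\sum_{i=1}^n W_i = \int \varphi^* \Bigl[\sum_{i=1}^n\Bigl(-\frac{h^2}{8\pi^2 m}\nabla_i^2 + U_{0i}\Bigr) + \sum_{i=1}^n\sum_{k \ne i}\frac{e^2}{r_{ik}}\Bigr]\varphi\, dV,$$

whereas the actual value of the total energy, corresponding to our approximation, is

$$W = \int \varphi^* H \varphi\, dV = \int \varphi^* \Bigl[\sum_{i=1}^n\Bigl(-\frac{h^2}{8\pi^2 m}\nabla_i^2 + U_{0i}\Bigr) + \frac{1}{2}\sum_{i=1}^n\sum_{k \ne i}\frac{e^2}{r_{ik}}\Bigr]\varphi\, dV,$$

the mutual potential energy of all the electrons thus being doubled in the expression $\sum W_i$.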

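In order to calculate the total energy $W$ with the help of the 'partial energies' $W_i$ we must introduce in addition the 'proper energies' of the separate electrons

$$E_i = \int \psi_i^* \Bigl(-\frac{h^2}{8\pi^2 m}\nabla_i^2 + U_{0i}\Bigr)\psi_i\, dx_i = \int \varphi^* \Bigl(-\frac{h^2}{8\pi^2 m}\nabla_i^2 + U_{0i}\Bigr)\varphi\, dV.$$

Denoting their sum $\sum E_i$ by $E$, we get

$$W - \tfrac{1}{2}\sum W_i = \tfrac{1}{2}E,$$

whence

$$W = \tfrac{1}{2}\Bigl(E + \sum_{i=1}^n W_i\Bigr) = \tfrac{1}{2}\sum_{i=1}^n (E_i + W_i). \qquad (360\,\text{c})$$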
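The iteration implied by the equations (360) to (360 b), together with the energy bookkeeping just described, can be sketched on a small model. Everything specific in the sketch is an assumption made for illustration and is not Hartree's actual computation: one spatial dimension, two particles, units in which $h/2\pi = m = 1$, a harmonic external field standing in for $U_0$, a softened repulsion standing in for $e^2/r_{ik}$, a finite-difference grid, and a simple mixing of old and new densities to steady the iteration. The partial energies are computed here as expectation values of the one-particle operators.

```python
import numpy as np


def hartree_two_particles(n_grid=161, half_width=6.0, strength=1.0,
                          n_iter=40, mix=0.5):
    """Self-consistent iteration for a product function psi_1(x1) psi_2(x2)."""
    x = np.linspace(-half_width, half_width, n_grid)
    dx = x[1] - x[0]
    # Kinetic energy by central differences, plus the assumed external field.
    kinetic = (np.eye(n_grid) - 0.5 * np.eye(n_grid, k=1)
               - 0.5 * np.eye(n_grid, k=-1)) / dx**2
    h = kinetic + np.diag(0.5 * x**2)
    # Softened repulsion F(x, x'), an assumed stand-in for e^2 / r_ik.
    F = strength / np.sqrt((x[:, None] - x[None, :])**2 + 1.0)

    def lowest_state(op):
        _, vecs = np.linalg.eigh(op)
        return vecs[:, 0] / np.sqrt(dx)  # so that sum |psi|^2 dx = 1

    psi = [lowest_state(h), lowest_state(h)]
    dens = [p**2 for p in psi]
    for _ in range(n_iter):
        # U_i(x): field of the other particle's charge cloud, as in (360).
        U = [F @ dens[1] * dx, F @ dens[0] * dx]
        psi = [lowest_state(h + np.diag(U[i])) for i in range(2)]
        dens = [mix * p**2 + (1.0 - mix) * d for p, d in zip(psi, dens)]

    proper = [float(p @ h @ p * dx) for p in psi]            # E_i
    mutual = float(psi[0]**2 @ F @ psi[1]**2 * dx * dx)       # mutual energy
    partial = [proper[0] + mutual, proper[1] + mutual]        # W_i
    total = proper[0] + proper[1] + mutual                    # W
    return proper, partial, total, mutual


proper, partial, total, mutual = hartree_two_particles()
```

The returned numbers show the point made in the text: the sum of the partial energies exceeds the total energy by the mutual energy, which is counted twice, while half the sum of proper and partial energies reproduces $W$.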
It should be mentioned that Hartree's self-consistent field can be defined either by the resulting probable density of the electric charge $\rho = e\sum_{i=1}^n |\psi_i|^2$, from which the electric potential with due allowance for the contribution of the external field can be derived by means of Poisson's equation, or by the electric density $\rho'_i = \rho - \rho_i = \rho - e\,|\psi_i|^2$ and the potential energy (360) which corresponds to an electric field of specific form for each of the electrons.
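We shall now come back to the variational equation in the form

$$\delta \int \varphi^* H \varphi\, dV = 0, \qquad (361)$$

with $\varphi$ defined as the product $\psi_1(x_1) \ldots \psi_n(x_n)$, and the $n$ additional normalizing conditions $\int \psi_i^* \psi_i\, dx = 1$ or

$$\delta \int \psi_i^* \psi_i\, dx = 0. \qquad (361\,\text{a})$$

We have

$$\delta \int \varphi^* H \varphi\, dV = \int \delta\varphi^*\, H \varphi\, dV + \int \varphi^* H\, \delta\varphi\, dV.$$

Now in virtue of the self-adjoint character of the operator $H$ (which we shall suppose to involve real quantities only) we have further

$$\int \varphi^* H\, \delta\varphi\, dV = \int \delta\varphi\, H \varphi^*\, dV,$$

so that (361) can be written in the form

$$\int \delta\varphi^*\, H \varphi\, dV + \int \delta\varphi\, H \varphi^*\, dV = 0.$$

Substituting here the product $\psi_1(x_1) \ldots \psi_n(x_n)$ for $\varphi$ we get

$$\sum_{i=1}^n \Bigl\{ \int \delta\psi_i^* \prod_{k \ne i} \psi_k^*\; H \varphi\, dV + \int \delta\psi_i \prod_{k \ne i} \psi_k\; H \varphi^*\, dV \Bigr\} = 0.$$

If we subtract from this equation the $n$ equations equivalent to (361 a),

$$\int \delta\psi_i^* \prod_{k \ne i} \psi_k^*\; \varphi\, dV + \int \delta\psi_i \prod_{k \ne i} \psi_k\; \varphi^*\, dV = 0,$$

multiplied by suitably chosen parameters, $\lambda_i$ say, we can equate to zero the coefficients of all the variations $\delta\psi_i^*$ and $\delta\psi_i$ (Lagrange's method of undetermined multipliers). This gives

$$(H_i - \lambda_i)\,\psi_i = 0, \qquad (362)$$

where

$$H_i = \int \prod_{k \ne i} \psi_k^*\; H \prod_{k \ne i} \psi_k \prod_{k \ne i} dx_k \qquad (362\,\text{a})$$

is an operator which can be defined as the average value of the actual energy operator $H$ for a given position of the $i$th electron and for all the configurations of the other ones. Similar equations are obtained by equating to zero the coefficients of the variations $\delta\psi_i$, with $H_i$ replaced by $H_i^* = \int \prod_{k \ne i} \psi_k\; H \prod_{k \ne i} \psi_k^* \prod_{k \ne i} dx_k$. They need not be considered separately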


for they are actually equivalent to the equations (362). The latter provide the mathematical justification for the physical principle which was used by Hartree† and are practically equivalent to Hartree's equation (360 b) if $H$ is determined, as usual, by the formula

$$H = \sum_{i=1}^n E(x_i, p_i) + \frac{1}{2}\sum_{i=1}^n \sum_{k \ne i} F(x_i, x_k) \qquad (362\,\text{b})$$

with $F(x_i, x_k) = e^2 / r_{ik}$. The only difference between them consists, as is easily seen, in the fact that $H_i$ involves, in addition to the proper energy of the $i$th electron $-\frac{h^2}{8\pi^2 m}\nabla_i^2 + U_{0i}$ and its average potential energy with respect to the rest, the average of the energies of all the other electrons. Hence the constants $\lambda_i$ appearing in (362) are easily seen to have the same value, namely, $W$, the total energy of the system. It should be mentioned that the normal state of the latter corresponds to the condition that $W$ should have the least possible value of all the 'stationary' values which are allowed by the variational equation (361), in conjunction with (361 a).

† It may be remembered that essentially the same principle had been used before by Schrödinger in connexion with his attempt to re-establish the wave theory of light emission on the basis of wave mechanics. (See Part I, § 17.)

The preceding theory applies not only to a system of electrons but just as well to a system consisting of different particles or indeed of systems of any sort if $x_i$ denotes the totality of the coordinates specifying the state of the corresponding elementary system and if the total energy (362 b) is written in the somewhat more general form

$$H = \sum_i E_i(x_i, p_i) + \frac{1}{2}\sum_i \sum_{k \ne i} F_{ik}(x_i, x_k). \qquad (362\,\text{c})$$


44. The Method of the Self-consistent Field with Antisymmetrical Functions and Dirac's Density Matrix

In the particular case of a system of electrons the accuracy of Hartree's method is limited not only intrinsically but also by the fact that a specific distribution of electrons among the $n$ orbits $\psi_1, \ldots, \psi_n$ such as that defined by the function $\varphi$ violates the identity principle. The function $\varphi$ defined by the product $\psi_1(x_1) \ldots \psi_n(x_n)$ must serve merely as a starting-point for the perturbation theory which has been considered in § 37 in connexion with the exchange degeneracy.

Instead of accounting for the latter a posteriori we can take it into account from the beginning if we replace the factorized function $\varphi$ in the variational equation by a linear combination of such functions, corresponding to the different permutations $P$ of the electrons between the individual states $\psi_1, \ldots, \psi_n$:

$$\chi = \sum_P C_P\, P\varphi.$$

The functions $\psi_1, \ldots, \psi_n$ obtained in this way will of course be somewhat different from those which are defined by the equations (361c) and which do not involve the exchange effect. As to the coefficients $C_P$, they can be shown to be the same as in the case of the perturbation problem corresponding to functions $\psi_1, \ldots, \psi_n$ known a priori.

We shall determine the latter for the antisymmetrical functions with spin which have been dealt with in the preceding section. We put accordingly

$$\chi = C \begin{vmatrix} \psi_1(x_1, \xi_1) & \cdots & \psi_1(x_n, \xi_n) \\ \vdots & & \vdots \\ \psi_n(x_1, \xi_1) & \cdots & \psi_n(x_n, \xi_n) \end{vmatrix} = C \sum_P \epsilon_P\, P\varphi(x, \xi), \qquad (363)$$

where

$$\psi_i(x, \xi) = \psi_i(x)\,\delta_{m_i}(\xi)$$

and

$$\varphi(x, \xi) = \psi_1(x_1, \xi_1)\,\psi_2(x_2, \xi_2) \ldots \psi_n(x_n, \xi_n).$$

We shall further assume for the sake of simplicity all the individual wave functions with spin not only to be normalized but also to be mutually orthogonal in the sense of the equations

$$\sum_\xi \int \psi_i^*(x, \xi)\,\psi_k(x, \xi)\, dx = \delta_{ik}. \qquad (363\,\text{a})$$

It should be mentioned that if this orthogonality condition were not fulfilled for the original wave functions $\psi$ we could replace them by certain linear combinations satisfying these conditions. The a priori introduction of the latter does not therefore impair the generality of the theory. It serves, however, materially to simplify its external form. The normalizing condition for the function (363) under the assumption (363 a) gives $C = 1/\sqrt{n!}$.

It will be convenient in what follows to write $x_i$ for $x_i, \xi_i$ and $\int$ for $\sum_\xi \int$, thus keeping externally the notation corresponding to spinless functions. We can formally proceed in the same way as if we were dealing with an antisymmetrical function (363) without spin.† Substituting it instead of $\varphi$ in the variational equation (361) (which in our case should be written in the form $\delta \sum_\xi \int \chi^* H \chi\, dV = 0$) and taking account of the self-adjointness of the operator $H$, we get as before

$$\int \delta\chi^*\, H \chi\, dV + \int \delta\chi\, H \chi^*\, dV = 0 \qquad (364)$$

(the summation over the $\xi$'s being understood).
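Now we have according to (363)

$$\delta\chi^* = \frac{1}{\sqrt{n!}} \sum_P \epsilon_P\, P\,\delta\varphi^*,$$

and further, since the integral $\int P\delta\varphi^*\, H \chi\, dV$ (or more exactly $\sum_\xi \int P\delta\varphi^*\, H \chi\, dV$) does not change if any permutation, $P^{-1}$ in particular, is applied to all the integration variables,

$$\int \delta\chi^*\, H \chi\, dV = \frac{1}{\sqrt{n!}} \sum_P \epsilon_P \int \delta\varphi^*\, H\, P^{-1}\chi\, dV,$$

or finally, since $P^{-1}\chi = \epsilon_P\, \chi$,

$$\int \delta\chi^*\, H \chi\, dV = \frac{1}{\sqrt{n!}} \sum_P \int \delta\varphi^*\, H \chi\, dV = \sqrt{n!} \int \delta\varphi^*\, H \chi\, dV,$$

and in the same way

$$\int \delta\chi\, H \chi^*\, dV = \sqrt{n!} \int \delta\varphi\, H \chi^*\, dV.$$

† The variations $\delta\psi$ must of course refer to the factor $\psi_i(x)$ only, leaving the spin factor $\delta_{m_i}(\xi)$ unaltered.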


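If we now substitute for $H$ the operator (362 b) (by definition not involving the spin) and replace $\sqrt{n!}\,\chi$ by the expression $\sum_P \epsilon_P\, P\varphi$, we get

$$\int \delta\chi^*\, H \chi\, dV = \sum_P \epsilon_P \int \delta\varphi^*\, H\, P\varphi\, dV = \sum_P \epsilon_P \Bigl\{ \sum_{i=1}^n \int \delta\varphi^*\, E(x_i, p_i)\, P\varphi\, dV + \sum\sum_{i<k} \int \delta\varphi^*\, F(x_i, x_k)\, P\varphi\, dV \Bigr\}.$$

The integral $\int \delta\varphi^*\, E(x_i, p_i)\, P\varphi\, dV$, where $\delta\varphi^* = \sum_{j=1}^n \delta\psi_j^* \prod_{k \ne j} \psi_k^*$, is easily seen to be different from zero only if $P$ denotes the identical permutation (because of the orthogonality conditions $\int \psi_i^* \psi_k\, dx = \delta_{ik}$), when it reduces to $\int \delta\psi_i^*\, E(x_i, p_i)\,\psi_i\, dx_i$. We have further, if $P$ is the identical permutation,

$$\int \delta\varphi^*\, F(x_i, x_k)\, P\varphi\, dV = \iint [\delta\psi_i^*(x_i)\,\psi_k^*(x_k) + \psi_i^*(x_i)\,\delta\psi_k^*(x_k)]\, F(x_i, x_k)\,\psi_i(x_i)\,\psi_k(x_k)\, dx_i\, dx_k,$$

and

$$\int \delta\varphi^*\, F(x_i, x_k)\, P\varphi\, dV = \iint [\delta\psi_i^*(x_i)\,\psi_k^*(x_k) + \psi_i^*(x_i)\,\delta\psi_k^*(x_k)]\, F(x_i, x_k)\,\psi_k(x_i)\,\psi_i(x_k)\, dx_i\, dx_k$$

if $P$ is equal to the transposition $T_{ik}$, i.e. the interchange between the $i$th and $k$th electrons, and zero in all other cases. We thus get, on account of the symmetry relation $F(x_i, x_k) = F(x_k, x_i)$,

$$\int \delta\chi^*\, H \chi\, dV = \sum_i \int dx_i\, \delta\psi_i^*(x_i) \Bigl\{ \Bigl[ E(x_i, p_i) + \sum_{k \ne i} \int dx_k\, F(x_i, x_k)\, |\psi_k(x_k)|^2 \Bigr] \psi_i(x_i) - \sum_{k \ne i} \int dx_k\, F(x_i, x_k)\, \psi_k^*(x_k)\,\psi_i(x_k)\; \psi_k(x_i) \Bigr\}.$$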
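Putting for the sake of brevity

$$A_{ki}(x) = \int F(x, x')\,\psi_k^*(x')\,\psi_i(x')\, dx' \qquad (365)$$

and

$$B(x) = \sum_k A_{kk}(x), \qquad (365\,\text{a})$$

we can rewrite the preceding expression as follows (the terms with $k = i$, which cancel each other, may be included in both sums):

$$\int \delta\chi^*\, H \chi\, dV = \sum_i \int dx\, \delta\psi_i^*(x) \Bigl\{ [E(x, p) + B(x)]\,\psi_i(x) - \sum_k A_{ki}(x)\,\psi_k(x) \Bigr\} = 0. \qquad (365\,\text{b})$$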


Subtracting from this equation and the conjugate complex equation $\int \delta\chi\, H \chi^*\, dV = 0$ the expressions

$$\lambda_{ki} \int \delta\psi_i^*(x)\,\psi_k(x)\, dx + \lambda_{ki} \int \psi_i^*(x)\,\delta\psi_k(x)\, dx = 0,$$

which are derived from the orthogonality and normalizing conditions $\int \psi_i^* \psi_k\, dx = \delta_{ik}$, and equating to zero the coefficients of the variations $\delta\psi_i^*$, we obtain the following system of equations for the functions $\psi_i(x)$:

$$(E + B)\,\psi_i(x) - \sum_{k=1}^n (A_{ki} + \lambda_{ki})\,\psi_k(x) = 0, \qquad (366)$$

and a similar system for the conjugate complex functions. If we multiply these equations on the left by $\psi_j^*(x)$ and integrate over $x$ (including summation over $\xi$) we get, in virtue of the orthogonality and normalizing relations,

$$\lambda_{ji} = \int \psi_j^*(x)\,(E + B)\,\psi_i(x)\, dx - \sum_k \int \psi_j^*(x)\, A_{ki}(x)\,\psi_k(x)\, dx,$$

or, according to (365) and (365 a),

$$\lambda_{ji} = E_{ji} + \iint F(x, x')\,\psi_j^*(x)\,\psi_i(x) \sum_k \psi_k^*(x')\,\psi_k(x')\, dx\, dx' - \iint F(x, x')\,\psi_j^*(x)\,\psi_i(x') \sum_k \psi_k^*(x')\,\psi_k(x)\, dx\, dx', \qquad (366\,\text{a})$$

where

$$E_{ji} = \int \psi_j^*(x)\, E(x, p)\,\psi_i(x)\, dx \qquad (366\,\text{b})$$

are the matrix elements of the proper energy of an electron [including its external potential energy $U_0(x)$] with respect to the states $i$ and $j$.

Although the coefficients $\lambda_{ji}$ are completely determined by these equations, they can actually be considered as arbitrary constants forming an Hermitian matrix, i.e. satisfying the relations $\lambda_{ij}^* = \lambda_{ji}$, and further subject to the condition that the diagonal sum $\sum_i \lambda_{ii}$ should have a given constant value.

This conclusion follows from the fact that the set of normalized and orthogonal wave functions $\psi_i$ can be replaced by any set of linear combinations of these functions, provided the transformed functions

$$\psi_{i'} = \sum_i c_{i'i}\,\psi_i$$

also satisfy the normalizing and orthogonality conditions. In fact the functions $A_{ik}$ are transformed by (365) according to the equations

$$A_{i'k'} = \sum_i \sum_k c_{i'i}^*\, c_{k'k}\, A_{ik},$$

i.e. like the components of a tensor in the $n$-dimensional space, whose coordinates are defined by the values of the $n$ functions $\psi_i(x)$. The latter can also be considered as the components of a vector referred to a certain set of orthogonal coordinate axes, $\psi_{i'}$ being its components with respect to another system of such axes (with the same origin). In other words, the equations (366) can be considered as invariant with regard to all the orthogonal transformations or 'rotations' of the coordinate axes, if the coefficients $\lambda_{ik}$ are likewise defined as the components of an arbitrary tensor $\lambda$, the operators $E$ and $B$ being obviously scalars.

As has been pointed out by Dirac, the arbitrariness involved in the determination of the components $\psi_i(x)$ of a vector $\psi(x)$ can be removed if instead of such a vector we consider its scalar product with the conjugate complex $\psi^*(x')$ of a vector $\psi(x')$ associated with some other point $x'$. This product, which will be denoted as

$$\rho(x, x') = \sum_i \psi_i(x)\,\psi_i^*(x'), \qquad (367)$$

is invariant under the above transformation and is therefore the only quantity that can be determined unambiguously in connexion with our problem. It can, moreover, easily be shown to be the only quantity we actually need know, the energy

$$W = \int \chi^* H \chi\, dV$$

of the system of electrons being expressible as a function of $\rho$.
In fact the preceding formula is reduced (in the same way as the expression $\int \delta\chi^*\, H \chi\, dV$) to the form

$$W = \int \chi^* H \chi\, dV = \sum_P \epsilon_P \int \varphi^* H\, P\varphi\, dV,$$

or, if the energy operator is defined by (362 b),

$$W = \int \varphi^* H \varphi\, dV - \sum\sum_{i<k} \int \varphi^* H\, T_{ik}\varphi\, dV.$$

Hence we get in the same way as in the derivation of (365 b)

$$W = \sum_i E_{ii} + \frac{1}{2} \iint F(x, x')\,[\rho(x, x)\,\rho(x', x') - |\rho(x, x')|^2]\, dx\, dx', \qquad (367\,\text{a})$$

where $|\rho(x, x')|^2 = \rho(x, x')\,\rho(x', x)$.

It should be mentioned that the sum $\sum_i \lambda_{ii}$ differs from this expression by the absence of the factor $\tfrac{1}{2}$ in the second term, which corresponds to the mutual energy of the different electrons, so that we can put

$$W = \tfrac{1}{2} \sum_i (E_{ii} + \lambda_{ii}).$$

The quantities $\lambda_{ii}$ are thus easily seen to correspond to the partial energies of our previous theory.


The integral $\tfrac{1}{2} \iint F(x, x')\,\rho(x, x)\,\rho(x', x')\, dx\, dx'$ with $F(x, x') = e^2 / |\mathbf{r} - \mathbf{r}'|$ represents the mutual potential energy which is obtained if the charges of the electrons are distributed in space with a volume density $e\,|\psi_i(x)|^2$, $e\,\rho(x, x) = e \sum_i |\psi_i(x)|^2$ being the resulting density of the 'electron cloud'. This includes the action of an electron spread out into a cloud upon itself, which is devoid of physical meaning. Such self-action is, however, cancelled out by the second integral on the right side of (367 a),

$$-\tfrac{1}{2} \iint F(x, x')\,|\rho(x, x')|^2\, dx\, dx',$$

which also represents the exchange effect or, as it is usually denoted, the 'exchange energy' of the electrons.†

The first term in (367 a) does not seem at first sight to be consistent with the representation of the energy as a function of the 'density matrix' $\rho$. If, however, we introduce the elements of the electron's own energy matrix $E$ from the point of view of the coordinates $x$,

$$E(x, x') = \int \delta(x - x'')\, E(x'', p_{x''})\, \delta(x'' - x')\, dx''$$

(cf. § 17), we can put, since $E_{ii} = \int \psi_i^*(x)\, E\, \psi_i(x)\, dx$,

$$\sum_i E_{ii} = \iint E(x, x')\,\rho(x', x)\, dx\, dx'. \qquad (367\,\text{b})$$

The fact that the energy $W = \int \chi^* H \chi\, dV$ is expressed as a function (or rather a 'functional') of the density matrix $\rho$ alone shows that the latter can be determined directly without the functions $\psi_1(x), \ldots, \psi_n(x)$ which have initially served for its definition. Multiplying the equations (366) by $\psi_i^*(x')$, subtracting therefrom the product by $\psi_i(x)$ of the corresponding equations for the conjugate complex of $\psi_i(x')$, and summing over $i$, taking into account the relations $\lambda_{ik}^* = \lambda_{ki}$ and $A_{ik}^* = A_{ki}$, we can eliminate the coefficients $\lambda_{ik}$ with the result

$$[E(x, p_x) + B(x) - E(x', p_{x'}) - B(x')] \sum_i \psi_i(x)\,\psi_i^*(x') - \sum_i \sum_k [A_{ki}(x) - A_{ki}(x')]\,\psi_k(x)\,\psi_i^*(x') = 0. \qquad (368)$$

† As has been stated at the beginning, the integration sign in the preceding equations actually means both integration with regard to the geometrical coordinates and a summation over the spin coordinates. The latter can easily be introduced explicitly in the final results. They are, however, wholly irrelevant so long as we are dealing with a spinless energy. Their only effect is to allow the introduction of doubly occupied spinless states $\psi_i(x)$ (with opposite spin) without the violation of Pauli's exclusion principle. As a result we get a number of relations of the form $A_{ii}(x) = A_{ik}(x) = A_{kk}(x)$ and $A_{il}(x) = A_{kl}(x)$ for indices $i$ and $k$ which correspond to identical spinless states.

If we substitute here the expression (365) for $A_{ki}(x)$ with $x'$ replaced by $x''$ and similarly put $A_{ki}(x') = \int F(x', x'')\,\psi_k^*(x'')\,\psi_i(x'')\, dx''$, we obtain the following equation, containing the density matrix alone,

$$(E_x + B_x - E_{x'} - B_{x'})\,\rho(x, x') - \int [F(x, x'') - F(x', x'')]\,\rho(x, x'')\,\rho(x'', x')\, dx'' = 0, \qquad (368\,\text{a})$$

where $E_x$, etc., is an abbreviation for $E(x, p_x)$, etc.

Introducing a matrix $K$ defined from the point of view of $x$ by the formula

$$K(x, x') = E(x, x') + \delta(x - x')\,B(x') - A(x, x'), \qquad (369)$$

where

$$A(x, x') = F(x, x')\,\rho(x, x') = e^2\, \frac{\rho(x, x')}{|\mathbf{r} - \mathbf{r}'|} \qquad (369\,\text{a})$$

and, according to (365 a),

$$B(x) = \int F(x, x')\,\rho(x', x')\, dx' = e^2 \int \frac{\rho(x', x')}{|\mathbf{r} - \mathbf{r}'|}\, dx', \qquad (369\,\text{b})$$

we can consider the left-hand side of the equation (368 a) as the $(x, x')$ element of the matrix $K\rho - \rho K$ and accordingly rewrite it in the following matrix form:

$$K\rho - \rho K = 0. \qquad (370)$$

It should be mentioned that the matrix $A(x, x')$ subtracts from the matrix $B(x, x') = \delta(x - x')\,B(x')$ physically irrelevant terms corresponding to the action of an electron upon itself and at the same time accounts for the exchange effect.

With the new matrix notation we can rewrite the expression of the energy as a function of $\rho$ derived above,

$$W = \iint dx\, dx'\, \bigl\{ E(x, x')\,\rho(x', x) + \tfrac{1}{2} F(x, x')\,[\rho(x, x)\,\rho(x', x') - |\rho(x, x')|^2] \bigr\}, \qquad (371)$$

in the form

$$W = D[\rho\,(E + \tfrac{1}{2}B - \tfrac{1}{2}A)] = D[(E + \tfrac{1}{2}B - \tfrac{1}{2}A)\,\rho], \qquad (371\,\text{a})$$

where $D(M)$ is an abbreviation for the so-called diagonal sum (German, Spur), i.e. sum (or integral) of the diagonal elements of the matrix $M$; in the present case we have

$$D(M) = \int M(x, x)\, dx.$$


The equation (370) which is satisfied by $\rho$ can be obtained directly, i.e. without the use of the functions $\psi_1(x), \ldots, \psi_n(x)$, from the variational equation $\delta W = 0$. With the expression (371) for $W$ we get, since $F(x, x') = F(x', x)$,

$$\delta W = \iint dx\, dx'\, \bigl\{ E(x, x')\,\delta\rho(x', x) + F(x, x')\,[\rho(x, x)\,\delta\rho(x', x') - \rho(x, x')\,\delta\rho(x', x)] \bigr\}$$

$$= \iint dx\, dx'\, [E(x, x') + \delta(x - x')\,B(x') - A(x, x')]\,\delta\rho(x', x)$$

$$= \iint dx'\, dx\; \delta\rho(x, x')\,[E(x', x) + \delta(x' - x)\,B(x) - A(x', x)],$$

that is, according to (369),

$$\delta W = D(\delta\rho\, K) = D(K\, \delta\rho) = 0. \qquad (371\,\text{b})$$
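The matrix statements (371), (371 a) and (371 b) lend themselves to a finite-dimensional check, with the coordinate $x$ replaced by a small set of grid points so that integrals become sums. The sketch below rests on assumptions of our own: a real symmetric random matrix stands in for $E(x, x')$, an arbitrary symmetric kernel stands in for $F(x, x')$, and the orbitals are random orthonormal columns. It verifies only the algebraic identities, not the self-consistency condition (370).

```python
import numpy as np

rng = np.random.default_rng(1)
n_points, n_states = 10, 3

# Real orthonormal orbitals (columns) and their density matrix rho(x, x').
psi, _ = np.linalg.qr(rng.normal(size=(n_points, n_states)))
rho = psi @ psi.T

# Assumed stand-ins for the one-particle matrix E(x, x') and the kernel F(x, x').
sym = rng.normal(size=(n_points, n_points))
E = 0.5 * (sym + sym.T)
grid = np.arange(n_points)
F = 1.0 / (1.0 + np.abs(grid[:, None] - grid[None, :]))


def energy(r):
    """Discrete form of (371): sums replace the integrals over x and x'."""
    d = np.diag(r)
    return np.sum(E * r.T) + 0.5 * (d @ F @ d - np.sum(F * r * r.T))


def k_matrix(r):
    """K = E + B - A as in (369), with B diagonal and A = F * rho elementwise."""
    B = np.diag(F @ np.diag(r))
    A = F * r
    return E + B - A, A, B


K, A, B = k_matrix(rho)
W_explicit = energy(rho)
W_trace = np.trace((E + 0.5 * B - 0.5 * A) @ rho)   # diagonal-sum form (371 a)

# First-order change of W under a small symmetric variation of rho, cf. (371 b).
raw = rng.normal(size=(n_points, n_points))
delta = 0.5 * (raw + raw.T)
eps = 1e-6
dW_numeric = (energy(rho + eps * delta) - energy(rho)) / eps
dW_trace = np.trace(K @ delta)
```

The variation used here is unrestricted; the accessory condition on $\rho$ discussed next is what turns the vanishing of $D(K\,\delta\rho)$ into the commutation relation (370).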
It must not be concluded from this equation that $K = 0$, for the matrix $\rho$ satisfies a certain accessory condition which is obtained by comparing it with its square.

We have in fact, from the definition of matrix multiplication,

$$\rho^2(x, x') = \int \rho(x, x'')\,\rho(x'', x')\, dx'' = \sum_i \sum_k \int \psi_i(x)\,\psi_i^*(x'')\,\psi_k(x'')\,\psi_k^*(x')\, dx''$$

$$= \sum_i \sum_k \psi_i(x)\,\psi_k^*(x')\,\delta_{ik} = \sum_i \psi_i(x)\,\psi_i^*(x') = \rho(x, x')$$

(because of the orthogonality and normalization of the functions $\psi_i$), i.e.

$$\rho^2 = \rho. \qquad (372)$$

It follows that $\delta\rho = \rho\,\delta\rho + \delta\rho\,\rho$, that is,

$$\delta\rho(x', x'') = \int \rho(x', x)\,\delta\rho(x, x'')\, dx + \int \delta\rho(x', x)\,\rho(x, x'')\, dx,$$

which in conjunction with (371 b) leads to (370).

The relation (372) shows that the characteristic values $\rho'$ of the matrix $\rho$ are equal either to 0 or to 1 (since they satisfy the same equation $\rho'^2 = \rho'$). We thus obtain, according to Dirac, a new formulation of Pauli's exclusion principle, for although the matrix $\rho$ can be introduced irrespective of the statistical properties of the particles under consideration, yet it can be shown to possess a dynamical meaning (in the sense of describing the motion of a system of particles) for the Pauli-Fermi statistics only (see below, p. 463), so that Pauli's principle is expressed implicitly by the property (372) of $\rho$.

If in the equation (367) we sum over all values of $i$ specifying a complete set of individual wave functions $\psi_i$ (which corresponds to $n = \infty$), so long as all these wave functions are normalized and orthogonal to each other, we obtain

$$\rho(x, x') = \delta(x - x'). \qquad (372\,\text{a})$$

This expression can be used as an approximation to $\rho$ for large values of $n$. It is easily seen to satisfy the relation (372).
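The properties of $\rho$ stated above can be illustrated on a discrete model in which $x$ runs over a finite number of points. The sketch is only an illustration under assumptions of our own (random orthonormal orbitals produced by a QR factorization, hypothetical variable names); it checks the idempotency (372), the characteristic values 0 and 1, the invariance of $\rho$ under a unitary mixing of the occupied orbitals, and the limiting form (372 a) for a complete set.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_states = 12, 4


def random_complex(rows, cols):
    return rng.normal(size=(rows, cols)) + 1j * rng.normal(size=(rows, cols))


# Orthonormal orbitals psi_i(x) as the columns of a matrix.
psi, _ = np.linalg.qr(random_complex(n_points, n_states))
rho = psi @ psi.conj().T          # rho(x, x') = sum_i psi_i(x) psi_i*(x')

# Unitary mixing of the occupied orbitals leaves rho unchanged.
u, _ = np.linalg.qr(random_complex(n_states, n_states))
psi_mixed = psi @ u
rho_mixed = psi_mixed @ psi_mixed.conj().T

# Characteristic values of rho, in ascending order.
eigenvalues = np.linalg.eigvalsh(rho)

# A complete orthonormal set gives the unit matrix, the discrete delta function.
full, _ = np.linalg.qr(random_complex(n_points, n_points))
rho_full = full @ full.conj().T
```

The diagonal sum of `rho` equals the number of occupied orbitals, which is the discrete counterpart of $\int \rho(x, x)\, dx = n$.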

The preceding results can be generalized for a non-stationary motion of the electrons, determined, in the method of the configuration space, by the equation

$$\Bigl(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\Bigr)\chi = 0. \qquad (373)$$

In order to obtain the corresponding generalized form of the equations (366) or of the equation (370) for the density matrix, we need only remember that the equation (373) is equivalent to the variational equation

$$\int \delta\chi^* \Bigl(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\Bigr)\chi\, dV = 0 \qquad (373\,\text{a})$$

[cf. § 26, eq. (207a)]. Now
ox a od 
a g = * = “or 
| 8x = dV > ep | 8 yee [8 a dV 


~ > foe B+ (d J Sean) 


Putting for the sake of brevity 


op; e 


we thus get 


-s5]? age av = > f ot(-s5 stb dde. 


Equating this expression to the expression (365 b) for $\int \delta\chi^* H\chi\,dV$, and
taking account of the orthogonality and normalizing conditions in the
form
$$\delta\int \psi_i^*(x)\,\psi_k(x)\,dx = 0,$$
we get instead of (366) the equations
$$\Bigl(\frac{h}{2\pi i}\frac{\partial}{\partial t}+E+B-A\Bigr)\psi_i(x) - \sum_{k=1}^{n}(\lambda_{ik}+b_{ik})\,\psi_k(x) = 0, \tag{374}$$
where $b_{ik}$ are numerical coefficients, or more generally functions of the
time.

These can be determined in the same way as the coefficients A,,, 
i.e. by expressions similar to (366a) and differing from the latter by 


additional terms 5— a a pe — es dz: 


= Nato : = | ut Sa 1h ae (3748) 


Hence we see that they must satisfy the conditions 


and are otherwise quite arbitrary. ee: the sum 


Zh = Edt ae =, | at Eide, 


we have, according to (374), 


¥ bis = Tau ZI] B+ BM dx + FF [ (Ant bedbtyn de. 
Now in view of the orthogonality and normalizing relations
$$\sum_i\sum_k \int b_{ik}\,\psi_i^*\psi_k\,dx = \sum_i b_{ii}.$$
The sum $\sum_i b_{ii}$ thus drops out of the preceding equation which reduces
to the equation (367 a) for $\sum_i \lambda_{ii}$.


The arbitrary coefficients $b_{ik}$ can be eliminated from the equations
(374) in the same way as from the equations (366), namely, by multiplying
(374) by $\psi_i^*(x')$, subtracting the product with $\psi_i(x)$ of the
corresponding (conjugate complex) equation for $\psi_i^*(x')$, and summing over $i$.
We thus get, instead of (368),
$$-\frac{h}{2\pi i}\frac{\partial\rho(x,x')}{\partial t} = (E_x+B_x-E_{x'}-B_{x'})\,\rho(x,x') - \int [F(x,x'')-F(x',x'')]\,\rho(x,x'')\rho(x'',x')\,dx'', \tag{375}$$
or in the matrix form corresponding to (370)
$$-\frac{h}{2\pi i}\frac{\partial\rho}{\partial t} = K\rho-\rho K. \tag{375 a}$$
This relation should be distinguished from the expression
$$\frac{h}{2\pi i}\frac{dM}{dt} = HM-MH$$
for the time derivative of any matrix or operator which is specified in
terms of the same variables as the Hamiltonian $H$ of the system and
which does not contain the time explicitly. If $K$ is considered as the
energy matrix of our system of electrons reduced by the method of the
self-consistent field to a single particle, then the expression
$$\frac{2\pi i}{h}(K\rho-\rho K) = [K,\rho]$$
gives that part of the time derivative of $\rho$ which corresponds to the
rate of change of the dynamical variables ($x$ and $p$, for example)
through which it can be expressed. The total derivative of $\rho$ with
respect to the time will thus be
$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t}+\frac{2\pi i}{h}(K\rho-\rho K) = 0$$
in virtue of (375 a), which means that $\rho$ is a constant of the motion
determined by $K$.



The total energy of the system of electrons W as given by (371) or 
(371 a) is not a matrix but an ordinary number. If the external field 
involved in the proper energy of an electron E does not depend upon 
the time, W can be shown to be a constant of the motion. We have 
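in fact, in exactly the same way as in the derivation of (371 b),
$$\frac{dW}{dt} = D\Bigl[\frac{\partial\rho}{\partial t}K\Bigr],$$
or, according to (375 a),
$$-\frac{h}{2\pi i}\frac{dW}{dt} = D[(K\rho-\rho K)K] = D[K\rho K]-D(\rho K^2),$$
which is easily seen to vanish, since the diagonal sum of a product is
unchanged by a cyclic permutation of its factors.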
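The vanishing of $D[K\rho K]-D(\rho K^2)$ can be illustrated with small matrices. This is a minimal sketch, not part of the original argument: `K` and `RHO` are hypothetical 3×3 stand-ins (a real symmetric matrix and an idempotent matrix), and `K` is held fixed rather than depending on `RHO` as it does in the text.

```python
# Check that D[(K rho - rho K) K] = D[K rho K] - D(rho K^2) vanishes.
# K and RHO are small illustrative matrices, not taken from the text.

K = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 1]]        # real symmetric matrix standing in for K
RHO = [[1, 0, 0],
       [0, 1, 0],
       [0, 0, 0]]      # idempotent matrix standing in for rho


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diagonal_sum(a):
    return sum(a[i][i] for i in range(len(a)))


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]


d_k_rho_k = diagonal_sum(matmul(matmul(K, RHO), K))
d_rho_k2 = diagonal_sum(matmul(RHO, matmul(K, K)))
rate = diagonal_sum(matmul(commutator(K, RHO), K))
```

Here `commutator(K, RHO)` is not zero, so the chosen `RHO` is not stationary, yet `rate` still vanishes.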
The matrix
$$\rho\,(E+\tfrac12 B-\tfrac12 A) = \tfrac12\rho\,(E+K)$$
could be formally defined as the energy matrix of the system of
electrons, without, however, attaching any dynamical meaning to this
definition, for it is the matrix $K$ only which is entitled to play the role
of the energy matrix for a single particle. The matrix $K$ differs from
an ordinary energy matrix such as $E$, by the fact that it is itself
determined by the character of the motion, and that accordingly it cannot
be represented by an operator of the usual form $(p^2/2m)+U(x)$ even
with an unknown potential-energy function $U(x)$.

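One might be tempted to replace the equation (375 a) by an
equation of the usual wave type
$$-\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = K\psi. \tag{375 b}$$
The latter can in fact be shown to be equivalent to (375 a) or to (375)
in the special case of a single electron (but not otherwise). Replacing
$K$ by an energy operator of the ordinary type, $E$ say, multiplying the
equation
$$-\frac{h}{2\pi i}\frac{\partial}{\partial t}\psi(x) = E_x\psi(x)$$
by $\psi^*(x')$ and subtracting from it the equation
$$+\frac{h}{2\pi i}\frac{\partial}{\partial t}\psi^*(x') = E_{x'}\psi^*(x')$$
multiplied by $\psi(x)$, we get
$$-\frac{h}{2\pi i}\frac{\partial}{\partial t}\{\psi(x)\psi^*(x')\} = (E_x-E_{x'})\,\psi(x)\psi^*(x'),$$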


which is a special case of (375b). The equation (375) or (375a) can 
thus be considered as the generalization of the wave mechanics of a 


single electron, which makes it possible, through the introduction of the 
density matrix $\rho(x,x')$ instead of ordinary wave functions, to describe
the motion of a system of n electrons in exactly the same way (with 
a modified definition of the energy matrix K) as the motion of a single 
electron. 

The complete disappearance of the number of electrons from the 
equations of the general theory seems at first sight very puzzling. This 
number must obviously be introduced a posteriori as an integration 
constant, or more exactly as a sort of quantum number, specifying the 
system under consideration. 

We thus see that the theory of the density matrix naturally leads 
to a further development of quantum theory in the sense of second 
quantization discussed already in Part I (see next chapter). 


45. Approximate Solutions (Thomas-Fermi-Dirac Equation) 
Using Dirac's notation for the matrix elements and for the wave
functions we can transform the matrix $\rho$ from the point of view of $x$ to
that of $K$ with the help of the following equations:
$$(K'|\rho|K'') = \iint (K'|x')\,dx'\,(x'|\rho|x'')\,dx''\,(x''|K''), \tag{376}$$
the matrix elements $(K'|\rho|K'')$ and $(x'|\rho|x'')$ being both of the 'pure'
type, corresponding to a definite point of view. We can, however, define
in a similar way the mixed elements of $\rho$ corresponding to a 'double'
point of view $(K, E)$ which serves to connect the two matrices $K$ and
$E$ with each other. These elements are given by the formula
$$(E'|\rho|K') = \iint (E'|x')\,dx'\,(x'|\rho|x'')\,dx''\,(x''|K'), \tag{376 a}$$
which is similar to the equation
$$(E'|K') = \int (E'|x)\,dx\,(x|K') \tag{376 b}$$
for the transformation coefficients $(E'|K')$ (cf. § 18), and reduces to it
in the limiting case $n = \infty$ according to (372 a). The wave function
$(x|K')$ appearing in the transformation equations (376 a) and (376 b)
replaces in a certain sense the whole set of individual wave functions
$\psi_1(x),\dots,\psi_n(x)$ associated with the given value of $K'$, in agreement with
the fact that each of the $n$ electrons on account of the exchange
phenomenon must be distributed over all of them.

The introduction of the wave functions $(x|K')$ (although it is by no
means necessary nor even convenient) raises the question as to the
possibility of representing the energy $K$ as a function (of a perhaps
unusual type) of the dynamical variables $x$ and $p$ $(= p_x)$ used in the
wave mechanics of a single electron. Now, since $K$ is defined as a
function of the density matrix, this question amounts to the
transformation of the latter from the original viewpoint of $x$ to the 'mixed'
viewpoint $(x,p)$. In other words, we must find the transformation from
the 'pure' matrix elements $(x|\rho|x')$ to the 'mixed' matrix elements
$(x|\rho|p)$. This transformation is given by the equation
$$(x|\rho|x') = \int (x|\rho|p)\,dp\,(p|x') \tag{377}$$
or the reciprocal equation
$$(x|\rho|p) = \int (x|\rho|x')\,dx'\,(x'|p), \tag{377 a}$$
where $(x|p) = (p|x)^*$ is the well-known function
$$(x|p) = e^{i2\pi\,x\cdot p/h}, \tag{377 b}$$
$x\cdot p$ being an abbreviation for $xp_x+yp_y+zp_z$. The function (377 b) is
understood to be normalized according to the condition
$$\int (p'|x)\,dx\,(x|p'') = \delta(p'-p'').$$


We shall give here, following Dirac, an approximate solution of this
problem by treating $x$ and $p$ as ordinary, i.e. mutually commuting,
quantities in the sense of classical mechanics. The density $\rho$ (as well
as the energy $K$) will thus appear as a function $\rho(x,p)$ of the coordinates
and momenta of an electron, its product with the volume element of
the phase space $dx\,dp$ being proportional to the probability of finding
the electron in this volume element, or, in other words, to the relative
number of electrons to be expected in the latter. This physical meaning
of $\rho$ will become apparent from the following argument.

Let us consider $\rho(x,p) = \rho_x(p)$ as a function of $p$ for a fixed value
of $x$ and expand it in a Fourier integral†
$$\rho_x(p) = \int \rho_{x,\xi}\,e^{i2\pi\,p\cdot\xi/h}\,d\xi. \tag{378}$$
This expansion is quite similar to the expansion of a function of the
time $t$, the coordinate of an electron for example, for a motion with
a fixed energy $W$,
$$f_W(t) = \int f_{W,w}\,e^{i2\pi wt/h}\,dw,$$
$w/h$ being the frequency. Now in the latter case the Fourier coefficient
$f_{W,w}$ is well known (by the correspondence principle, Chap. III, § 12) to
represent approximately the matrix element of $f$, $(W|f|W+w)$, for two
neighbouring states with the energies $W$ and $W+w$ (provided $w \ll W$).

† It should be remembered that $x$ and $p$ are meant to denote the triplets of coordinates
and momenta, and that $dx$ actually means $dx\,dy\,dz$.



Since a coordinate $x$ and the corresponding momentum $p_x$ are related
to each other in exactly the same way as the energy and the time
(being canonically conjugate quantities), the Fourier coefficients of
$\rho_x(p)$ in (378) must likewise represent approximately the matrix
elements $(x|\rho|x+\xi)$ of $\rho$. The function $\rho(x,p)$ corresponding to the classical
definition of $\rho$ (as a quantity commuting with $x$) can thus be calculated
with the help of the matrix $(x|\rho|x')$ by the formula
$$\rho(x,p) = \int (x|\rho|x+\xi)\,e^{i2\pi\,p\cdot\xi/h}\,d\xi. \tag{378 a}$$
Comparing this with (377 a) and (377 b) we obtain the relation
$$(x|\rho|p) = \rho(x,p)\,e^{i2\pi\,x\cdot p/h}. \tag{379}$$
The Fourier coefficients in (378) can be calculated by the formula
$$\rho_{x,\xi} = (x|\rho|x+\xi) = \int \rho(x,p)\,e^{-i2\pi\,p\cdot\xi/h}\,\frac{dp}{h^3}, \tag{380}$$
$h^3$ appearing instead of $h$ because $dp$ actually denotes here the product
$dp_x\,dp_y\,dp_z$.
Putting here $\xi = 0$, we get
$$(x|\rho|x) = \frac{1}{h^3}\int \rho(x,p)\,dp, \tag{380 a}$$
whence it is clear that $\rho(x,p)$ can be defined as the probable number
of electrons per volume $h^3$ of the phase-space.
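The passage from the matrix $(x|\rho|x')$ to the function $\rho(x,p)$ and back to the diagonal element can be tried numerically. The sketch below is an illustration under several assumptions that are not in the text: one dimension only (so that $h$ appears in place of $h^3$), units with $h = 1$, a single occupied state with a Gaussian wave function, and the sign convention $e^{+i2\pi p\xi/h}$ in the transform. The function names are hypothetical.

```python
import cmath
import math

H = 1.0  # Planck's constant in arbitrary units (assumption of this sketch)


def psi(x):
    """Normalized one-dimensional Gaussian wave function (illustrative choice)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)


def rho_matrix(x, x2):
    """(x|rho|x') = psi(x) psi*(x') for a single occupied state (psi is real)."""
    return psi(x) * psi(x2)


def rho_phase(x, p, xi_max=10.0, n=400):
    """rho(x, p) = integral of (x|rho|x+xi) exp(i 2 pi p xi / h) d(xi), midpoint rule."""
    step = 2.0 * xi_max / n
    total = 0j
    for j in range(n):
        xi = -xi_max + (j + 0.5) * step
        total += rho_matrix(x, x + xi) * cmath.exp(2j * math.pi * p * xi / H)
    return total * step


def diagonal_from_phase(x, p_max=2.0, n=400):
    """(1/h) * integral of rho(x, p) dp: one-dimensional analogue of (380 a)."""
    step = 2.0 * p_max / n
    total = 0j
    for j in range(n):
        total += rho_phase(x, -p_max + (j + 0.5) * step)
    return total * step / H
```

With this unsymmetrical definition `rho_phase(x, p)` is in general complex; it is only its integral over `p` that reproduces the real diagonal element `rho_matrix(x, x)`.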

The preceding equations are obviously valid not only for the matrix
$\rho$ but also for any other matrix of the same type, and in particular for
the energy matrix $K$. Expressing it as a function of the variables $x$, $p$
we thus get
$$K(x,p) = E(x,p)+B(x,p)-A(x,p), \tag{381}$$
where $E(x,p)$ is the usual (classical) expression for the electron's own
energy $p^2/2m+U(x)$,
$$B(x,p) = B(x)\int \delta(\xi)\,e^{i2\pi\,p\cdot\xi/h}\,d\xi;$$
that is,
$$B(x,p) = B(x) = e^2\int \frac{\rho(x',x')}{r(x,x')}\,dx', \tag{381 a}$$
the usual expression for the Coulomb energy of an electron in a cloud of
electric charge with the volume density $e\rho(x',x')$, and
$$A(x,p) = \int (x|A|x+\xi)\,e^{i2\pi\,p\cdot\xi/h}\,d\xi. \tag{381 b}$$


Taking for the matrix $(x|A|x+\xi) = A(x,x+\xi)$ its expression (369 a) and
substituting for $(x|\rho|x+\xi)$ the expression (380), we get
$$A(x,p) = \iint F(x,x+\xi)\,\rho(x,p')\,e^{i2\pi(p-p')\cdot\xi/h}\,d\xi\,\frac{dp'}{h^3}.$$
Now $(p-p')\cdot\xi$ is the scalar product $(p_x-p'_x)\xi_x+(p_y-p'_y)\xi_y+(p_z-p'_z)\xi_z$
of the vectors $p-p'$ and $\xi$. Keeping the vector $p-p' = g$ fixed and
denoting by $\theta$ the angle made with it by the variable vector $\xi$, we can replace
the volume-element $d\xi$ $(= d\xi_x\,d\xi_y\,d\xi_z)$ by the expression $2\pi|\xi|^2\,d|\xi|\,\sin\theta\,d\theta$;
since $F(x,x+\xi)$ depends on the magnitude $|\xi| = r$ of the vector $\xi$ only,
we can carry out the integration over $\theta$, keeping $r$ constant. This gives
$$\int e^{i2\pi\,g\cdot\xi/h}\,d\xi = 2\pi r^2\,dr\int_{-1}^{+1} e^{i2\pi gr\cos\theta/h}\,d(\cos\theta) = 4\pi r^2\,dr\,\frac{\sin(2\pi gr/h)}{2\pi gr/h},$$
and consequently
$$\int (x|A|x+\xi)\,e^{i2\pi\,p\cdot\xi/h}\,d\xi = 2e^2\int \frac{\rho(x,p')}{g}\,\frac{dp'}{h^2}\int_0^\infty dr\,\sin(2\pi gr/h).$$


Now
$$\int_0^\infty \sin ar\,dr = -\frac{1}{a}\bigl[\cos ar\bigr]_0^\infty,$$
which is equal to $1/a$ + an indeterminate constant which can actually be
dropped. In fact, if instead of integrating over $r$ to $\infty$ we first extend
the integration to some large finite value, $R$ say, and pass to the limit
$R = \infty$, after carrying out the subsequent integration over $p'$, the term
containing $R$ vanishes. We thus get finally
$$A(x,p) = \frac{e^2}{\pi h}\int \frac{\rho(x,p')}{|p-p'|^2}\,dp'. \tag{382}$$
If the function $\rho(x,p')$ is replaced here by the function $f(x,p') = \rho/h^3$,
giving the probable number of electrons per unit volume of the
phase-space, the preceding expression assumes the form
$$A(x,p) = \frac{e^2h^2}{\pi}\int \frac{f(x,p')}{|p-p'|^2}\,dp', \tag{382 a}$$
which shows that the exchange energy, being purely a quantum effect,
vanishes with $h$, as of course it should provided the function $f$ remains
finite (which simply means that the number of electrons is finite).
If the function $f(x,p')$ vanishes for large values of $p'$, then for
sufficiently large values of $p$ we can put approximately
$$\int \frac{f(x,p')\,dp'}{|p-p'|^2} = \frac{1}{p^2}\int f(x,p')\,dp' = \frac{1}{p^2}\,\rho(x,x),$$
and consequently
$$A(x,p) = \frac{e^2h^2}{\pi p^2}\,\rho(x,x). \tag{383}$$



It was shown by Fock that this expression can be applied for all 
values of p in the case of an electron moving in the electric field of 
a number of other electrons, if its reaction upon the latter can be 
neglected. Thus, for example, if we consider an alkali atom containing
$n$ electrons of which $n-1$ form its inner core, while one can be treated
as an outsider (although it actually 'dives' into the core), then the
effect of the interchange of roles between this outsider and the core
electrons is the same as if the energy of the external electron were
decreased by the amount (383). Fock's formula can be obtained by
applying the variation principle to that part of the total energy
$W = \int \chi^* H\chi\,dV$ which, besides the proper energy $E$ of the 'external'
electron, contains terms representing its interaction with the other
electrons (whose motion is supposed to be given, i.e. to remain
unaffected by this interaction).

Taking for $W$ the expression (371) and putting $\rho = \rho_0+\rho_1$, we easily
get for the part in question the expression
$$W_1 = \int \psi_1^* E\psi_1\,dx + \iint F(x,x')\,[\rho_0(x',x')\rho_1(x,x) - \rho_0(x',x)\rho_1(x,x')]\,dx\,dx', \tag{384}$$
where
$$\rho_1(x,x') = \psi_1(x)\psi_1^*(x')$$
is the contribution to the total density matrix $\rho(x,x')$ of the electron
under consideration and $\psi_1(x)$ its wave function. The latter is to be
determined from the condition $\delta W_1 = 0$ (the normalization condition
being now irrelevant). This leads to the equation
$$[E(x,p)+B_0(x)-A_0]\,\psi_1 = W_1\psi_1, \tag{385}$$
where $B_0(x) = e^2\int \dfrac{\rho_0(x',x')}{r(x,x')}\,dx'$ is the Coulomb energy of the electron
in question with respect to the rest, while $A_0$ is the operator of the
exchange energy [including the physically irrelevant action of the
electron upon itself which must be subtracted from $B_0(x)$]. It has an
unusual form, being defined by
$$A_0\psi_1(x) = e^2\int \frac{\rho_0(x,x')}{r(x,x')}\,\psi_1(x')\,dx'. \tag{385 a}$$
Now, as has been pointed out by Fock, the quantity $1/r(x,x')$, i.e. the
reciprocal of the distance between the points $x$ and $x'$, can be considered
(if we leave aside for a while the spin coordinates) as the matrix element
with respect to $x$ and $x'$ of the operator $-4\pi/\nabla^2$, where $\nabla^2$ is Laplace's
operator. This follows from the fact that
$$\phi(x) = \int \frac{1}{r(x,x')}\,f(x')\,dx'$$
is the solution of the equation
$$\nabla^2\phi = -4\pi f,$$
which can be rewritten symbolically in the form
$$\phi = -\frac{4\pi}{\nabla^2}\,f.$$
In applying this result to (385 a) we must take care of the fact that
$1/\nabla^2$ operates on a function of $x'$ leaving $x$ constant. We must
accordingly come back to the original expression
$\rho_0(x,x') = \sum_{i=1}^{n-1}\psi_i(x)\psi_i^*(x')$ for
the density (where $n-1$ denotes the number of electrons in the core)
and insert the operator $1/\nabla^2$ between the $\psi_i(x)$ and $\psi_i^*(x')$ of the separate
terms.

Since $-\dfrac{h^2}{4\pi^2}\nabla^2 = p^2$, we obtain the following expression for $A_0\psi_1(x)$:
$$A_0\psi_1(x) = \frac{e^2h^2}{\pi}\sum_i \psi_i(x)\,p^{-2}\,\psi_i^*(x)\psi_1(x),$$
where $x'$ in $\psi_i^*(x')$ and $\psi_1(x')$ has been replaced by $x$ in view of the fact
that $p^{-2}$, by definition, converts a function $f(x')$ of $x'$ into a function
$f(x)$ of $x$.

If we now wish to consider the approximation corresponding to the
classical mechanics we must treat $p$ as an ordinary number, which
enables us to rewrite the preceding formula as follows:
$$A_0\psi_1(x) = \frac{e^2h^2}{\pi p^2}\,\rho_0(x,x)\,\psi_1(x)$$
and leads us back to the expression (383) for the operator $A_0$.

We have hitherto made no explicit use of the spin variables which
were understood to be included in $x$ and $p$ whenever they were
necessary. It is easy to rewrite the preceding equations with an explicit
notation for the spin variables. So long, however, as the dynamical
effects of the spin are neglected, its only influence will be to double the
maximum value of $\rho(x,p)$ which is allowed by the exclusion principle.
As has been stated above, $\rho(x,p)$ can be considered as the number of
electrons per volume $h^3$ of the classical phase-space (which, as we
know, corresponds to one single spinless state in the sense of classical
mechanics). Inasmuch as the inclusion of the spin allows each spinless
state to be doubly occupied (by electrons with their spin axes in
opposite directions), the effect of the spin will be simply to increase the
maximum value of $\rho(x,p)$ from 1 to 2.

If we consider a system of electrons, such as a complex atom, for
example, in the normal state, i.e. in the state of lowest energy $W$, we
can assume all the individual states of lowest energy to be doubly
occupied, or, in other words, all that part of the phase-space $x$, $p$ which
corresponds to the least possible value of the energy to be filled with
the maximum density $\rho = 2$ and the rest to remain quite empty. The
shape of the boundary surface can be determined from the condition
that $\rho$ is a constant of the motion as determined by the energy $K$. This
means, since $\partial\rho/\partial t = 0$, that $\rho$ must be a function of $K$, and that
consequently the boundary surface we are looking for must be a surface
of constant $K$.

Now we have
$$K = E(r,p)+B(r)-\frac{e^2}{\pi h}\int \frac{\rho(r,p')}{|p-p'|^2}\,dp', \tag{386}$$
where $r$ is written instead of $x$ in order to indicate the fact that we no
longer include the spin coordinate. Since within the part of the
phase-space which comes into play
$$\rho(r,p') = \text{const.} = 2,$$
the preceding equation is reduced to
$$K = E(r,p)+B(r)-\frac{2e^2}{\pi h}\int \frac{dp'}{|p-p'|^2}, \tag{386 a}$$
where the integral
$$\int \frac{dp'}{|p-p'|^2} = \iiint \frac{dp'_x\,dp'_y\,dp'_z}{(p_x-p'_x)^2+(p_y-p'_y)^2+(p_z-p'_z)^2}$$
must be extended over all the saturated part of the momentum space
$p'$ which is associated with a given point of the ordinary space $r$.

In order to evaluate this integral we must make some assumption
as to the shape of this saturated momentum space. We shall assume
it to be spherical, its radius $P_r$ being a certain (for the present
undetermined) function of $r$. We then get
$$A(r,p) = \frac{2e^2}{h}\Bigl(\frac{P_r^2-|p|^2}{|p|}\log\Bigl|\frac{P_r+p}{P_r-p}\Bigr| + 2P_r\Bigr). \tag{387}$$
We have further
$$\rho(r,r) = \frac{1}{h^3}\int \rho(r,p)\,dp = \frac{8\pi}{3h^3}\,P_r^3 \tag{388}$$
and consequently
$$B(r) = \frac{8\pi e^2}{3h^3}\int \frac{P_{r'}^3}{|r-r'|}\,dr'. \tag{388 a}$$

At the boundary surface we must have $p = P_r$ and consequently
$$A(r,P_r) = \frac{4e^2}{h}\,P_r.$$
The equation of this boundary reduces accordingly to
$$K(r,P_r) = E(r,P_r)+B(r)-\frac{4e^2}{h}\,P_r = \text{const.} \tag{389}$$
This equation serves to determine $P$ as a function of $r$. It can be
replaced by a differential equation of the Poisson type if we take into
account the fact that $B(r)$ is the product of the charge $e$ of an electron
and the potential $\phi$ due to a distribution of charge with a density
(388) multiplied by $e$. We thus get, applying the Laplace operator $\nabla^2$
to the equation (389) and assuming $E(r,p)$ to be of the usual form
$p^2/(2m)+U(r)$ with $\nabla^2 U = 0$,
$$\nabla^2\Bigl(\frac{P^2}{2m}-\frac{4e^2}{h}P\Bigr)+\nabla^2 B = 0,$$
or
$$\nabla^2\Bigl(\frac{P^2}{2m}-\frac{4e^2}{h}P\Bigr) = 4\pi e^2\rho(r,r) = \frac{32\pi^2e^2}{3h^3}\,P^3. \tag{390}$$
This equation (due to Dirac) is a generalization of the equation of the
Thomas-Fermi theory which has been considered in Part I, § 32. It
differs from the latter by the additional term $-4e^2P/h$ which represents
the exchange effect (and also eliminates the self-action of the electrons),
the electric potential or the density function being replaced by the
function $P$.
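The momentum-space integral behind (386 a), and the boundary value $4e^2P_r/h$ used in (389), can be checked numerically. The following sketch is illustrative only: it assumes a sphere of radius `P` centred at the origin of momentum space, carries out the angular integration analytically, and treats the radial integral by a simple midpoint rule. The function names and the default units $e^2 = h = 1$ are hypothetical choices, not part of the text.

```python
import math


def sphere_integral_numeric(p, P, n=20000):
    """Integral of dp'/|p - p'|^2 over the sphere |p'| < P.

    The angular integration gives 2*pi*(q/p)*log|(p+q)/(p-q)| for |p'| = q;
    the remaining radial integral is done by the midpoint rule.
    """
    step = P / n
    total = 0.0
    for j in range(n):
        q = (j + 0.5) * step
        total += (q / p) * math.log(abs((p + q) / (p - q)))
    return 2.0 * math.pi * total * step


def sphere_integral_closed(p, P):
    """Closed form pi*[(P^2 - p^2)/p * log|(P+p)/(P-p)| + 2P]; equals 2*pi*P at p = P."""
    if p == P:
        return 2.0 * math.pi * P
    log_term = math.log(abs((P + p) / (P - p)))
    return math.pi * ((P * P - p * p) / p * log_term + 2.0 * P)


def exchange_energy(p, P, e2=1.0, h=1.0):
    """A(r, p) = (2 e^2 / (pi h)) * sphere integral, i.e. the last term of (386 a)."""
    return 2.0 * e2 / (math.pi * h) * sphere_integral_closed(p, P)
```

For example, `exchange_energy(P, P)` reduces to `4 * e2 * P / h`, the exchange term appearing in (389).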


IX


SECOND (INTENSITY) QUANTIZATION AND 
QUANTUM ELECTRODYNAMICS 


46. Second Quantization with respect to Electrons 

The reduction of the problem of the motion of a number of identical 
particles to that of a single one, carried out in the preceding chapter, 
involves a more or less rough approximation. A similar reduction can, 
however, be achieved in a different way, which corresponds to the 
method of copies which was sketched in Part I, § 20, and is connected 
with a quantization of the amplitudes of the waves representing the 
motion of a single particle. This procedure may be denoted as ‘second’ 
or ‘intensity’ quantization. 

This method was inaugurated by Dirac in connexion with the 
theory of light quanta for a system of particles which are describable 
by a symmetrical wave function. We shall, however, develop it 
in the first place for a system of electrons which will lead us to a 
generalization and improvement of the results obtained in the pre- 
ceding chapter. 

In describing a system of $N$ electrons we have used hitherto only
$N$ individual wave functions $\psi_1(x), \psi_2(x),\dots, \psi_N(x)$ which enable us in
the case of stationary states to account for the exchange degeneracy
only. We shall now introduce an infinite set of mutually orthogonal
and normalized wave functions of this sort (with spin), leaving their
form undetermined for a while (they may, for example, represent the
motion of an electron in the external field alone with neglect of its
mutual action with all the other electrons). We shall further combine
them into sets of $N$ functions and for each set form an antisymmetrical
function $\chi$ in the same way as before. Instead, however, of identifying
$\chi$ with the exact wave function $\Omega(x_1, x_2,\dots, x_N, t)$ describing the motion
of the electrons, we shall define the latter as a linear combination of
all such functions,
$$\Omega = \sum_n C_n\chi_n \tag{391}$$
and shall determine the coefficients $C_n$ as functions of the time in such
a way as to make $\Omega$ an actual solution of the exact wave equation
$$\Bigl(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\Bigr)\Omega = 0 \tag{392}$$
(which can involve the spin variables). The coefficients $C_n$ satisfying
this condition are determined by the well-known equations of the
perturbation theory
$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = \sum_{n'} H_{nn'}C_{n'}, \tag{392 a}$$
where
$$H_{nn'} = \int \chi_n^* H\chi_{n'}\,dX \tag{392 b}$$
(the 'integration' over the coordinates $X$ of all the electrons being
understood to include a summation over the spin coordinates).

The indices $n$ specifying the functions $\chi$ can be considered as
representing the totality of the numbers $n_1, n_2,\dots, n_r,\dots$ corresponding to the
individual wave functions $\psi_1, \psi_2,\dots, \psi_r,\dots$ and equal to 1 if these
functions are included in the set forming $\chi_n$, and to 0 in the converse case.
Thus $n_r = 1$ if the function $\psi_r$ is contained in $\chi_n$ and 0 if it is not
contained in it. We could also write more fully $\chi_n = \chi(n_1, n_2,\dots, n_r,\dots; X, t)$
and $C_n = C(n_1, n_2,\dots, n_r,\dots; t)$. The numbers $n_r$ may be denoted as the
partition numbers, indicating whether the corresponding $r$th individual
state is occupied by an electron or not. In calculating the matrix
elements (392 b) we can use the formula [cf. (364 a), § 44]
$$H_{nn'} = \int \chi_n^* H\chi_{n'}\,dX = \sqrt{N!}\int \phi_n^* H\chi_{n'}\,dX = \sum_P \epsilon_P\int \phi_n^* H P\phi_{n'}\,dX, \tag{393}$$
where $\phi_n$ and $\phi_{n'}$ are the factorized wave functions corresponding to
a definite distribution of the electrons between the occupied states,
for instance,
$$\phi_n(X) = \psi_{r_1}(x_1)\psi_{r_2}(x_2)\cdots\psi_{r_N}(x_N) \tag{394}$$
and
$$\phi_{n'}(X) = \psi_{r'_1}(x_1)\psi_{r'_2}(x_2)\cdots\psi_{r'_N}(x_N). \tag{394 a}$$
It will be convenient to assume for a while that the indices $r_1, r_2,\dots, r_N$
of the occupied states are arranged in the same order as the indices
$1, 2,\dots, N$ of the electrons, i.e. that
$$r_1 < r_2 < \dots < r_N. \tag{394 b}$$
This means merely a certain (arbitrary) denomination of the $N$ wave
functions $\psi$ forming the set under consideration. We could put, for
example, so long as we are concerned with this particular set, $r_1 = 1$,
$r_2 = 2,\dots, r_N = N$. The order of the indices in the other sets $n'$ must
of course be left arbitrary. So long as $H$ has the usual form
$$H = \sum_i E(x_i,p_i)+\sum_{i<k} F(x_i,x_k)$$
the matrix elements (393) will vanish identically if the set $n'$ differs
by more than two individual states from the set $n$ (in view of the
orthogonal property of the wave functions $\psi$). We must therefore
distinguish three cases:

(1) $n = n'$, i.e. $n_r = n'_r$ for all values of $r$, or simply $\chi_n = \chi_{n'}$. In
this case the matrix element (393) reduces to the value of the energy
$W$ already calculated in § 40. Putting
$$\int \psi_r^* E\psi_{r'}\,dx = E_{rr'} \tag{395}$$
and
$$\iint \psi_r^*(x)\psi_s^*(x')\,F(x,x')\,\psi_{r'}(x)\psi_{s'}(x')\,dx\,dx' = F_{rs;r's'} \tag{395 a}$$
we can rewrite it in the following form:
$$H_{nn} = \sum_r E_{rr}+\sum_{r<s}(F_{rs;rs}-F_{rs;sr}), \tag{396}$$
which is easily seen to coincide with (368 a).

(2) The set $n$ differs from $n'$ by the fact that one function, $\psi_p(x)$ say,
is replaced by another, $\psi_{p'}(x)$, all the other factors in (394) and (394 a)
being the same. We then get in a similar way, putting $r_i = p$ and
$r'_i = p'$,
$$H_{nn'} = E_{pp'}+\sum_{r\neq p}(F_{pr;p'r}-F_{pr;rp'}), \tag{396 a}$$
where the sub-subscript $i$ has been dropped.

(3) The set $n$ differs from $n'$ by the fact that two functions $\psi_{r_i}(x_i)$
and $\psi_{r_k}(x_k)$ are replaced by different functions (not belonging to the
original set) $\psi_{r'_i}(x_i)$ and $\psi_{r'_k}(x_k)$. We get in this case, writing $p$, $p'$ for $r_i$, $r'_i$ and
$q$, $q'$ for $r_k$, $r'_k$,
$$H_{nn'} = F_{pq;p'q'}-F_{pq;q'p'} \qquad (p < q;\ p \neq p',\ q \neq q'). \tag{396 b}$$
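The distinction between the three cases depends only on the partition numbers of the two sets. As a minimal sketch (the function name and the tuple representation are hypothetical, introduced here for illustration), the case can be read off by counting the states that are occupied in $n$ but not in $n'$:

```python
def classify(n, n_prime):
    """Return 1, 2 or 3 for the three cases above, or None if H_nn' vanishes.

    n and n_prime are tuples of partition numbers (0 or 1), one entry per
    individual state; both sets must contain the same number of electrons.
    """
    if len(n) != len(n_prime) or sum(n) != sum(n_prime):
        raise ValueError("both sets must describe the same number of electrons")
    # States occupied in n that have been replaced by other states in n'.
    replaced = [r for r, (a, b) in enumerate(zip(n, n_prime)) if a == 1 and b == 0]
    if len(replaced) > 2:
        return None          # more than two states differ: matrix element is zero
    return len(replaced) + 1
```

For instance, `classify((1, 1, 0, 0), (1, 0, 1, 0))` falls under case (2), since exactly one occupied state has been replaced.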
Let $\alpha_r$ denote an operator which when applied to $C_n = C(n_1, n_2,\dots, n_r,\dots)$
increases $n_r$ by unity if $n_r = 0$, that is, transforms
$$C(n_1, n_2,\dots, n_{r-1}, n_r, n_{r+1},\dots)$$
into $C(n_1, n_2,\dots, n_{r-1}, n_r+1, n_{r+1},\dots)$; if, on the other hand, $n_r = 1$, the
operator $\alpha_r$ reduces $C$ to zero. Let, further, $\alpha_r^\dagger$ denote an operator
which decreases $n_r$ by 1 if $n_r = 1$ and reduces $C$ to zero if $n_r = 0$.
The coefficient $C_{n'}$ corresponding to case (2) can be written accordingly
as $\alpha_p^\dagger\alpha_{p'}C_n$, and the coefficient $C_{n'}$ corresponding to case (3)
as $\alpha_p^\dagger\alpha_q^\dagger\alpha_{q'}\alpha_{p'}C_n$. It is now possible to write the equations (392 a) as
follows:
$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = K_n C_n, \tag{397}$$
where 


K,, = 2 f+ 2% > Hon’ pp & iy t 
a. oo aa r8; ar) + 2 2 > (F; pr;p'r Frey) pe 


ps iF Pap'a Foan'y’ ‘ap Ou 0 a Mp’: 
22 ypeY T#Q 


(397 a) 


The summation over $r$ and $s$ includes all those individual states which
are contained in the set $n$, i.e. represented by partition numbers $n_r$ and
$n_s$ equal to 1. The summation over $p$ and $q$ is extended only over such
of these states as are replaced by one or two other states in the set $n'$.
Thus the indices $p$, $q$ on the one hand, and $p'$, $q'$ on the other, are not
independent of each other. The expression (397 a) can be further
simplified with the help of the relations $\alpha_r C_n = 0$ if $n_r = 1$, $\alpha_r^\dagger C_n = 0$ if
$n_r = 0$. Applying the operator $\alpha_r$ to $C_n$ and omitting all the
arguments except $n_r$, we thus get
$$\alpha_r C(n_r) = \begin{cases} C(n_r+1) & \text{if } n_r = 0,\\ 0 & \text{if } n_r = 1,\end{cases}$$
and
$$\alpha_r^\dagger C(n_r) = \begin{cases} C(n_r-1) & \text{if } n_r = 1,\\ 0 & \text{if } n_r = 0.\end{cases}$$

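Under these conditions it is possible to represent the operators $\alpha_r$
and $\alpha_r^\dagger$ as matrices from the point of view of $n_r$, considered itself as a
diagonal matrix
$$n_r = \begin{pmatrix}0&0\\0&1\end{pmatrix} \tag{398}$$
with the characteristic values 0 and 1 or, what amounts to the same
thing, as a one-column matrix $n_r = \begin{pmatrix}0\\1\end{pmatrix}$. It should be mentioned that
the difference $1-n_r$ must be defined accordingly as the matrix
$\begin{pmatrix}1&0\\0&0\end{pmatrix}$ or $\begin{pmatrix}1\\0\end{pmatrix}$ respectively.

Regarded from this point of view the operators $\alpha_r$ and $\alpha_r^\dagger$ are
represented by the matrices
$$\alpha_r = \begin{pmatrix}0&1\\0&0\end{pmatrix}, \qquad \alpha_r^\dagger = \begin{pmatrix}0&0\\1&0\end{pmatrix} \tag{398 a}$$
satisfying the relations
$$\alpha_r^\dagger\alpha_r = n_r, \qquad \alpha_r\alpha_r^\dagger = 1-n_r. \tag{398 b}$$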


Any function $C(n_1, n_2,\dots, n_r,\dots)$ of the matrix arguments $n_r$, or more
exactly of their characteristic values, must likewise be dealt with as a
matrix. Leaving all the arguments but $n_r$ aside, we can define it as a
one-column matrix
$$C(n_r) = \begin{pmatrix}C(0)\\C(1)\end{pmatrix} \tag{399}$$
whose elements correspond to the two characteristic values of $n_r$. This
gives, according to the definition (398 a) of the operators $\alpha_r$ and $\alpha_r^\dagger$,
$$\alpha_r C(n_r) = \begin{pmatrix}C(1)\\0\end{pmatrix}, \qquad \alpha_r^\dagger C(n_r) = \begin{pmatrix}0\\C(0)\end{pmatrix}, \tag{399 a}$$
which is in agreement with the original definition of $\alpha_r$ and $\alpha_r^\dagger$. If
therefore we agree to consider the partition numbers as the
characteristic values of the corresponding operators (which will be denoted
by the same letters), we can rewrite the sum of the first two terms in
(397 a), corresponding to the proper energy of the electrons $E$, as follows:
$$\sum_r E_{rr}+\sum_p\sum_{p'\neq p} E_{pp'}\,\alpha_p^\dagger\alpha_{p'} = \sum_r\sum_{r'} E_{rr'}\,\alpha_r^\dagger\alpha_{r'}, \tag{400}$$
since for all values of $r$ and $r'$ which are not actually represented
in the sum on the left-hand side of this equation, the operator $\alpha_r^\dagger\alpha_{r'}$ is
equivalent to 0.
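The single-state relations can be verified directly with 2×2 matrices. The sketch below is illustrative and rests on one assumption not fixed by the garbled source: the rows and columns are ordered as $(n_r = 0,\ n_r = 1)$, so that a column `[C(0), C(1)]` lists the two values of $C$. The helper names are hypothetical.

```python
# Single-state operators as 2x2 matrices; basis order (n_r = 0, n_r = 1)
# is an assumption of this sketch.

N_R = [[0, 0], [0, 1]]          # partition number n_r as a diagonal matrix
ALPHA = [[0, 1], [0, 0]]        # alpha_r
ALPHA_DAG = [[0, 0], [1, 0]]    # alpha_r dagger
UNIT = [[1, 0], [0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def apply(op, column):
    """Apply a 2x2 operator to the one-column matrix [C(0), C(1)]."""
    return [sum(op[i][k] * column[k] for k in range(2)) for i in range(2)]
```

Applied to a column `[c0, c1]`, `ALPHA` returns `[c1, 0]` and `ALPHA_DAG` returns `[0, c0]`, which is the shift of the argument described in the text.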

Turning to the other terms of (397 a) which correspond to the mutual
energy of the electrons, we shall show in the first place that they can
be collected together in a form similar to the last term with no other
restriction imposed on the summation indices $p$, $q$, $p'$, $q'$ than the
condition $p < q$.

In fact the second term is easily seen to be obtained from the last
if we put $q' = q = r$ and interpret the product $\alpha_q^\dagger\alpha_q$ as the operator $n_r$.
Since this operator commutes with $\alpha_p^\dagger$ (so long as $p \neq q$) we can write
it on the left of $\alpha_p^\dagger$ and extend the summation over all values of $r$ which
are larger than $p$, those terms which correspond to values of $r$ not
represented in $n$ being automatically cancelled.

It should be emphasized in this connexion that the order of the four
factors $\alpha_p^\dagger\alpha_q^\dagger\alpha_{q'}\alpha_{p'}$ in the last term of (397 a) is not taken at random,
but precisely with a view to ensuring the inclusion of the preceding
term under the condition $q' = q = r$. It is easily seen in the same way
that the first term containing $F$ in (397 a) is obtained from the following
one if we put $p' = p$, or consequently from the last term if we put
simultaneously $p = p' = r$ and $q = q' = s$ with the one restriction
$r < s$, i.e. $p < q$. We thus get


22 (Fern Eso) + 2, 2 p> (F, pr;p'r ony Oy 
an 2 55 > (Foava— Fpaan' 4 %q Xp’ (400 a) 


P<a, DFP, GAG 


= >> 2 p (Foap'g — A paia'n' op % Mg Xp 
P<@ Pp 
and consequently 


K,, = K= p> > Ey Oe 2, p > > (F, r6,7's’ | pe a! Og: Xp’, (400 b) 


where the indices $p$ and $q$ have been replaced by $r$ and $s$ in order to
indicate the fact that the summation can be extended over all values
of these indices, the terms represented by non-vanishing partition
numbers $n_r$, $n_s$ corresponding to the state $n$ being actually the only ones
left. This is why we are now entitled to drop the index $n$ for $K$. If instead
of writing $\alpha_r^\dagger\alpha_s^\dagger\alpha_{s'}\alpha_{r'}$ in the second term of (400 b) we had written
$\alpha_r^\dagger\alpha_s^\dagger\alpha_{r'}\alpha_{s'}$, it would have been impossible to include all the three terms
of (397 a) containing $F$ in one.

The second step in the simplification of the operator $K$ consists in the
removal of the restrictive condition $r < s$ and in the simultaneous
unification of the positive and negative summands in the second term
of (400 b), representing the mutual action of the electrons. In order to
carry out this simplification we must introduce instead of the $\alpha$'s new
operators
$$a_r = \pm\alpha_r, \qquad a_r^\dagger = \pm\alpha_r^\dagger$$
with an appropriate rule for choosing the upper or lower sign, so that
we could write


ata, = ala, (401) 
and alafa,a,=alafa,a, =alata,a, (401 a) 
atata,a, = —afalu,a, = —alata,a,. (401 b) 


This enables us to put 

DD Ew ate = LD By a, a, 
and = 
YEE Fare Fase lat of ay 0 


r<a or 
see Pe: Fray’y' Ip A My App 2222% rear A A,d, 


rcs or’ 


= oe aE arr’ Uy fafa, Get+ o>. *F, ory's U4 A! Ay a, 


‘ 
in view of the symmetry relation
$$F_{rs;r's'} = F_{sr;s'r'}$$
and the relations (401 a), or finally
$$K = \sum_r\sum_{r'} E_{rr'}\,a_r^\dagger a_{r'}+\tfrac12\sum_r\sum_s\sum_{r'}\sum_{s'} F_{rs;r's'}\,a_r^\dagger a_s^\dagger a_{s'}a_{r'}; \tag{402}$$
the summation being extended over all the values of the indices $r$, $r'$, $s$, $s'$
without any restrictive conditions whatsoever, all the restrictions being
carried out automatically.

In order to define the operators $a$ explicitly we must take into account
the condition which has been stated at the beginning of this section as
to the arrangement of the indices $r_1, r_2,\dots$ specifying the individual
states in the set $n$:
$$r_1 < r_2 < \dots < r_N.$$
This condition has been used in all our preceding expressions for $K$ up
to the expression (400 b), and has been dropped only in the last
expression (402).

Now the operators $a$ must be defined in such a way as actually to
enable us to get rid of this condition. This can be done, following
Jordan and Wigner, by putting
$$a_r = \nu_r\alpha_r, \qquad a_r^\dagger = \nu_r\alpha_r^\dagger, \tag{402 a}$$
where $\nu_r$ is an operator with the characteristic values 1 and $-1$ (i.e.
equivalent to taking $\alpha$ with the $+$ or $-$ sign) which is defined as the
product†
$$\nu_r = \prod_{s=1}^{r-1}(1-2n_s). \tag{402 b}$$
The separate factors in this product are themselves operators of the
same kind as $\nu_r$ (the characteristic values of $n_s$ being 0 and 1, those of
the difference $1-2n_s$ must be $+1$ and $-1$). The operators $a$ defined
in this way are easily seen to satisfy the conditions (401), (401 a), and
(401 b), or the more simple conditions not involving the original $\alpha$'s:
$$a_r^\dagger a_r = n_r, \tag{403}$$
$$a_ra_s+a_sa_r = a_r^\dagger a_s^\dagger+a_s^\dagger a_r^\dagger = 0 \tag{403 a}$$
and finally
$$a_r^\dagger a_s+a_sa_r^\dagger = \delta_{rs}. \tag{403 b}$$


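We have in fact

$$a_r^\dagger a_r = \nu_r\alpha_r^\dagger\alpha_r\nu_r = \nu_r n_r\nu_r = n_r\nu_r^2 = n_r,$$

since $n_r$ is represented by a diagonal matrix, just as $\nu_r$ is, and therefore
commutes with $\nu_r$, whose square is equal to 1, i.e. to the unit matrix
$\begin{pmatrix}1&0\\0&1\end{pmatrix}$.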


Further, if $r < s$, we obviously have

$$\alpha_r\alpha_s = \alpha_s\alpha_r$$

(the case $r = s$ is devoid of interest since the operator $\alpha_r\alpha_r$ applied to
any function $C_n$ gives identically zero). On the other hand,

$$a_r a_s = \alpha_r\nu_r\alpha_s\nu_s = \nu_r\alpha_r\alpha_s\nu_s = \nu_r\alpha_s\alpha_r\nu_s,$$

since, according to the definition (402 b), $\alpha_r$ commutes with all the
factors in $\nu_r$. It does not commute, however, so long as $r < s$, with one
factor in $\nu_s$, namely, $(1-2n_r)$. Applying it to the latter we have

$$\alpha_r(1-2n_r) = \alpha_r - 2\alpha_r n_r.$$

Now $\alpha_r n_r = (n_r+1)\alpha_r$ if $n_r = 0$, and $0$ if $n_r = 1$. In both cases we get

$$\alpha_r(1-2n_r) = -(1-2n_r)\alpha_r,$$

and consequently

$$\alpha_r\nu_s = -\nu_s\alpha_r \qquad (r < s).$$

† We slightly diverge here from Jordan and Wigner by extending the product over
$s$ to $s = r-1$ instead of $s = r$.

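The preceding relation can be derived in a somewhat different way if
we replace $n_r$ in $1-2n_r$ by the product $\alpha_r^\dagger\alpha_r$. We then get

$$\alpha_r(1-2\alpha_r^\dagger\alpha_r) = \alpha_r - 2\alpha_r(\alpha_r^\dagger\alpha_r) = \alpha_r - 2(\alpha_r\alpha_r^\dagger)\alpha_r = (1-2\alpha_r\alpha_r^\dagger)\alpha_r = [1-2(1-n_r)]\alpha_r$$

according to (398 b), which coincides with our previous result.
Coming back to our original expression for $a_r a_s$, we have

$$a_r a_s = \nu_r\alpha_s\alpha_r\nu_s = -\nu_r\alpha_s\nu_s\alpha_r,$$

or, since the three operators $\nu_r$, $\alpha_s$, $\nu_s$ commute with each other while
$\nu_r$ commutes also with $\alpha_r$,

$$a_r a_s = -\alpha_s\nu_s\nu_r\alpha_r = -\alpha_s\nu_s\alpha_r\nu_r = -a_s a_r.$$

The second relation (403 a) is proved in exactly the same way.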

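In the case $r = r'$ relation (403 b) immediately follows from the relations
$\alpha_r^\dagger\alpha_r = n_r$ and $\alpha_r\alpha_r^\dagger = 1-n_r$ (see 398 b). In order to prove it for
the case $r < r'$ (or in general $r \ne r'$) we must use the fact that the
operators $\alpha_r^\dagger$ and $\alpha_{r'}$ commute with each other just as the operators
$\alpha_r$ and $\alpha_{r'}$ do. We have further, if $1-2n_r$ is written in the form

$$(1-2n_r) = -[1-2(1-n_r)] = -(1-2\alpha_r\alpha_r^\dagger),$$

$$\alpha_r^\dagger(1-2n_r) = -\alpha_r^\dagger(1-2\alpha_r\alpha_r^\dagger) = -[\alpha_r^\dagger - 2\alpha_r^\dagger(\alpha_r\alpha_r^\dagger)] = -[\alpha_r^\dagger - 2(\alpha_r^\dagger\alpha_r)\alpha_r^\dagger] = -(1-2n_r)\alpha_r^\dagger,$$

so that

$$\alpha_r^\dagger\nu_{r'} = -\nu_{r'}\alpha_r^\dagger \qquad (r < r')$$

as before, and consequently, since $\alpha_{r'}\nu_r = \nu_r\alpha_{r'}$,

$$a_r^\dagger a_{r'} = \nu_r\alpha_r^\dagger\alpha_{r'}\nu_{r'} = \alpha_{r'}\nu_r\alpha_r^\dagger\nu_{r'} = -\alpha_{r'}\nu_{r'}\nu_r\alpha_r^\dagger = -a_{r'}a_r^\dagger.$$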
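The relations (403), (403 a), and (403 b) can also be checked numerically for a small number of individual states. The following is only a minimal sketch under stated assumptions, not part of the original derivation: it assumes a finite set of $M$ states, represents each $\alpha_r$ by the matrix with rows (0, 1) and (0, 0) acting on the $r$th factor of a direct (Kronecker) product, takes $1-2n_s$ as the diagonal matrix with entries 1 and $-1$, and forms $a_r = \alpha_r\nu_r$. All function names are illustrative.

```python
# Pure-Python check of the Jordan-Wigner construction a_r = alpha_r * nu_r.
# Names such as kron, jordan_wigner, check_relations are illustrative only.

def kron(A, B):
    """Direct (Kronecker) product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def zeros(n):
    return [[0] * n for _ in range(n)]

UNIT = [[1, 0], [0, 1]]     # unit matrix for one state
ALPHA = [[0, 1], [0, 0]]    # alpha for one state, basis ordered (n=0, n=1)
SIGN = [[1, 0], [0, -1]]    # 1 - 2n with n = diag(0, 1)

def jordan_wigner(M):
    """Return [a_1, ..., a_M] as 2^M x 2^M integer matrices."""
    ops = []
    for r in range(M):
        # sign factors for the states preceding r, unit matrices after it
        factors = [SIGN] * r + [ALPHA] + [UNIT] * (M - r - 1)
        op = [[1]]
        for f in factors:
            op = kron(op, f)
        ops.append(op)
    return ops

def check_relations(M):
    a = jordan_wigner(M)
    ad = [transpose(x) for x in a]   # real matrices: adjoint = transpose
    dim = 2 ** M
    for r in range(M):
        n_r = matmul(ad[r], a[r])
        for i in range(dim):
            for j in range(dim):
                if i != j and n_r[i][j] != 0:
                    return False      # (403): n_r must be diagonal
            if n_r[i][i] not in (0, 1):
                return False          # characteristic values 0 and 1 only
        for s in range(M):
            if add(matmul(a[r], a[s]), matmul(a[s], a[r])) != zeros(dim):
                return False          # (403 a), first relation
            if add(matmul(ad[r], ad[s]), matmul(ad[s], ad[r])) != zeros(dim):
                return False          # (403 a), second relation
            expected = identity(dim) if r == s else zeros(dim)
            if add(matmul(ad[r], a[s]), matmul(a[s], ad[r])) != expected:
                return False          # (403 b)
    return True

print(check_relations(3))
```

Replacing `SIGN` by `UNIT` in `jordan_wigner` gives back the commuting operators $\alpha_r$, for which the check fails; this is a quick way to see what the factor $\nu_r$ contributes.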

Now that the relations (403), (403 a), and (403 b) are all proved, we no
longer need to think of the auxiliary operators $\alpha_r$ and $\nu_r$ which have
been used in their derivation and which depend on the physically
irrelevant order in which the different individual states are numbered
in the set $n$. The above relations are self-supporting, for they specify
in a perfectly unambiguous way the operators $a$ which serve to
express the energy operator $K$ of our problem. These operators can be
represented by certain matrices, from the point of view of the matrix
formed by the totality of the partition numbers $n_1, n_2, \ldots$, in a way
implying a certain ordered arrangement of the different individual
states. So long as we are interested in one particular state only we can
define the corresponding partition number $n_r$ as a two-dimensional
matrix (398) and represent the operators $a_r$ and $a_r^\dagger$ by the same matrices

$$a_r = \begin{pmatrix}0&1\\0&0\end{pmatrix}, \qquad a_r^\dagger = \begin{pmatrix}0&0\\1&0\end{pmatrix}$$

as those representing the operators $\alpha_r$ and $\alpha_r^\dagger$. The difference between
them and the operators $a_r$, $a_r^\dagger$ becomes apparent when we take into
account all the other states $s$ ($= 1, 2, 3, \ldots$). The general representation
of the operators $\alpha_r$, $\alpha_r^\dagger$ can be derived from (398 a) by multiplication by
unit matrices $\delta_s = \begin{pmatrix}1&0\\0&1\end{pmatrix}$ referring to all the other states ($s \ne r$):

$$\alpha_r = \delta_1\times\delta_2\times\cdots\times\delta_{r-1}\times\begin{pmatrix}0&1\\0&0\end{pmatrix}\times\delta_{r+1}\times\cdots,$$

$$\alpha_r^\dagger = \delta_1\times\delta_2\times\cdots\times\delta_{r-1}\times\begin{pmatrix}0&0\\1&0\end{pmatrix}\times\delta_{r+1}\times\cdots,$$

the product $M_1\times M_2$ of the matrices $M_1$ and $M_2$ denoting a matrix $M$
whose elements are obtained by combining multiplicatively the elements
of $M_1$ and $M_2$. In order to obtain the general representation of
$a_r$ and $a_r^\dagger$ we must replace the $r-1$ matrices $\delta_1, \delta_2, \ldots, \delta_{r-1}$ by the
matrices $1-2n_1, 1-2n_2, \ldots, 1-2n_{r-1}$. The matrices so defined,

$$a_r = (1-2n_1)\times(1-2n_2)\times\cdots\times(1-2n_{r-1})\times\begin{pmatrix}0&1\\0&0\end{pmatrix}\times\delta_{r+1}\times\cdots,$$

$$a_r^\dagger = (1-2n_1)\times(1-2n_2)\times\cdots\times(1-2n_{r-1})\times\begin{pmatrix}0&0\\1&0\end{pmatrix}\times\delta_{r+1}\times\cdots,$$

can easily be verified to satisfy all the relations (403 a, b). The totality
of the numbers $n_1, n_2, \ldots$ must be represented accordingly by the product
of the diagonal matrices representing each of them:

$$n = n_1\times n_2\times\cdots.$$

This operator product has, of course, nothing to do with an ordinary
product of the numbers which give the characteristic values of the
operators $n_r$ and which, as will be remembered, must satisfy the
relation

$$\sum_{r=1}^{\infty} n_r = N.$$

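The operators $a_r$ are not Hermitian, although the symbol $a_r^\dagger$ preserves
its meaning as the operator adjoint to $a_r$, i.e. as the conjugate complex
of the transposed matrix $\tilde a_r = \begin{pmatrix}0&0\\1&0\end{pmatrix}$. They can, however, be expressed
with the help of the Hermitian operators

$$\sigma_x = \begin{pmatrix}0&1\\1&0\end{pmatrix}, \qquad \sigma_y = \begin{pmatrix}0&i\\-i&0\end{pmatrix}, \qquad \sigma_z = \begin{pmatrix}-1&0\\0&1\end{pmatrix},$$

whose products with $h/4\pi$ represent the components of the electron's
spin (cf. § 29). We have, namely, limiting ourselves to one particular
state,

$$\sigma_x = a_r + a_r^\dagger, \qquad \sigma_y = i(a_r - a_r^\dagger),$$

whence

$$a_r = \tfrac12(\sigma_x - i\sigma_y), \qquad a_r^\dagger = \tfrac12(\sigma_x + i\sigma_y). \tag{404}$$

Hence it follows that

$$n_r = a_r^\dagger a_r = \tfrac14[\sigma_x^2 + \sigma_y^2 + i(\sigma_y\sigma_x - \sigma_x\sigma_y)],$$

or, according to (253), § 29,

$$n_r = \tfrac14(\sigma_x^2 + \sigma_y^2 + 2\sigma_z),$$

or, since $\sigma_x^2 = \sigma_y^2 = \delta$,

$$n_r = \tfrac12(\delta + \sigma_z), \tag{404 a}$$

which agrees with the definition $n_r = \begin{pmatrix}0&0\\0&1\end{pmatrix}$. It should be noticed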
further that ee ees 


and that accordingly 
vy, == tl 4(5—o.,), 
=1 


the subscript s in o,, serving to show that it refers to the sth state. 
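The relations (404) and (404 a) lend themselves to a direct check with two-rowed matrices. The sketch below is illustrative only. It assumes the sign convention $\sigma_z$ = diag($-1$, 1), which is the one needed for $\tfrac12(\delta+\sigma_z)$ to coincide with $n_r$ = diag(0, 1); many texts use the opposite sign. The helper names `matmul` and `lin` are hypothetical.

```python
# Check of (404) and (404 a) with 2x2 complex matrices (lists of rows).
# Assumption: sigma_z = diag(-1, 1), so that (1 + sigma_z)/2 = diag(0, 1).

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def lin(c1, A, c2, B):
    """Return the linear combination c1*A + c2*B."""
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

UNIT = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, 1j], [-1j, 0]]
SZ = [[-1, 0], [0, 1]]

a = lin(0.5, SX, -0.5j, SY)        # a = (sigma_x - i sigma_y) / 2
a_dag = lin(0.5, SX, 0.5j, SY)     # a^dagger = (sigma_x + i sigma_y) / 2
n = matmul(a_dag, a)               # occupation number operator

# sigma_x sigma_y - sigma_y sigma_x, to be compared with 2 i sigma_z
commutator = lin(1, matmul(SX, SY), -1, matmul(SY, SX))

print(a == [[0, 1], [0, 0]])
print(n == lin(0.5, UNIT, 0.5, SZ))
print(commutator == lin(2j, SZ, 0, UNIT))
```

With the opposite convention $\sigma_z$ = diag(1, $-1$) the same algebra gives $n_r = \tfrac12(\delta - \sigma_z)$, so the sign in (404 a) is tied to the ordering of the two rows.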

We thus see that the energy operator $K$ (402) can be expressed with
the help of the familiar spin operators associated with the different
states. This is natural if we remember that it is possible to represent
the interaction energy of the electrons in connexion with the exchange
effect with the help of the operators $\tfrac12(1+\boldsymbol\sigma_r\cdot\boldsymbol\sigma_s)$, as has been shown in
§ 42 of the preceding chapter. The problem is complicated in the present
case by the necessity of introducing the operators $\nu_r$ in order to ensure
the anticommutation of the operators $a_r$ and $a_s$ (or $a_r^\dagger$ and $a_s^\dagger$) referring
to different states (whereas the operators $\boldsymbol\sigma_r$ and $\boldsymbol\sigma_s$ must commute with
each other).

The fact that the operators $n_r$ can have only two different characteristic
values 0 and 1, which has been used as the basis of the above
definition of the operators $a_r$, can be considered as a consequence of
the properties of these operators expressed by the relations (403 a)
and (403 b) in connexion with the equation (403) which from this point
of view serves simply for the definition of the operator $n_r$. We have,
namely, multiplying the relation $a_r a_r^\dagger = 1-n_r$ on the left by $a_r^\dagger$,

$$a_r^\dagger a_r a_r^\dagger = a_r^\dagger - a_r^\dagger n_r,$$

or, since $a_r^\dagger a_r = n_r$,

$$n_r a_r^\dagger = a_r^\dagger - a_r^\dagger n_r,$$

whence, by right-hand multiplication by $a_r$, we get

$$n_r^2 = n_r - a_r^\dagger n_r a_r.$$

Now $a_r^\dagger n_r a_r = a_r^\dagger a_r^\dagger a_r a_r$, and according to the relations (403 a) we must
have $a_r a_r = a_r^\dagger a_r^\dagger = 0$. We thus get

$$n_r^2 - n_r = 0,$$

whence it follows that the only characteristic values of $n_r$ are 0 and 1.

The preceding theory can be put in a still more significant form by
introducing the expression

$$\Psi(x) = \sum_r a_r\psi_r(x). \tag{405}$$

Being an ordinary function of the coordinates of an electron (and
eventually of the time), it is to be considered at the same time as
an operator with respect to the amplitude coefficients $C(n_1, n_2, \ldots)$, which
play the role of the wave function in the equation

$$-\frac{h}{2\pi i}\frac{dC}{dt} = KC$$

with the energy operator $K$ defined by (400 b).
Multiplying $\Psi(x)$ on the left by the adjoint operator

$$\Psi^\dagger(x) = \sum_r a_r^\dagger\psi_r^*(x), \tag{405 a}$$

and integrating over $x$ (which includes as usual the summation over the
spin coordinates), we get in virtue of the orthogonality and normalization
of the functions $\psi_r(x)$:

$$\int\Psi^\dagger\Psi\,dx = \sum_{r=1}^{\infty} a_r^\dagger a_r = \sum_{r=1}^{\infty} n_r. \tag{405 b}$$
r=1 r=1 


This equation is quite similar to that corresponding to the ordinary
case of functions of the type (405) with amplitude coefficients $a_r$ defined
as ordinary numbers. Replacing such numbers by operators satisfying
the conditions (403 a) and (403 b) (or even the less restrictive conditions
$a_r a_r = 0$, $a_r^\dagger a_r^\dagger = 0$, $a_r^\dagger a_r + a_r a_r^\dagger = 1$) we obtain for the number of
electrons associated with any individual state one of the two characteristic
values 0, 1 of the operator $a_r^\dagger a_r = n_r$, in agreement with the exclusion
principle. The total number of electrons $N$ can be defined accordingly
as a characteristic value of the sum $\sum_{r=1}^{\infty} n_r$, so that it appears in the role
of an additional 'intensity' or 'quantitative' quantum number (cf.
Part I, § 20). The operator

$$N = \sum_{r=1}^{\infty} n_r$$

is easily seen to commute with the energy operator $K$ in virtue of the
relations (403 a, b) and to represent accordingly a constant of the
motion, which means that the number of electrons forming any particular
system is constant, as of course it should be. The operator $K$
can itself be expressed in terms of the operator-function $\Psi(x)$ and its
adjoint operator $\Psi^\dagger(x)$, not containing explicitly the operator-coefficients
$a_r$. We have, namely,


$$\sum_r\sum_{r'} E_{rr'}\,a_r^\dagger a_{r'} = \int\Psi^\dagger(x)E\Psi(x)\,dx$$

and

$$\sum_r\sum_s\sum_{r'}\sum_{s'} F_{rs\,r's'}\,a_r^\dagger a_s^\dagger a_{s'}a_{r'} = \iint\Psi^\dagger(x)\Psi^\dagger(x')F(x, x')\Psi(x')\Psi(x)\,dx\,dx',$$

so that $K$ can be written in the following form:

$$K = \int\Psi^\dagger(x)E\Psi(x)\,dx + \tfrac12\iint\Psi^\dagger(x)\Psi^\dagger(x')F(x, x')\Psi(x')\Psi(x)\,dx\,dx', \tag{406}$$

which is somewhat similar to the expression for the value of the energy
$W$ given by the equation (371), § 44, if the density matrix $\rho(x, x')$ is
replaced by the product $\Psi^\dagger(x)\Psi(x')$. The main difference between the
two expressions lies in the fact that the exchange effect which is represented
by the negative term under the double integral sign in (371) is
not present in (406), where this exchange effect is automatically accounted
for by the properties of the operators $\Psi(x)$.
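The statement that $N = \sum_r n_r$ commutes with $K$ can be tested on a small example. The following is a hedged sketch, not the author's procedure: it assumes $M = 3$ individual states, builds the operators $a_r$ as direct products with sign factors (the matrix construction described earlier in this section), and forms twice the operator (402) with arbitrary integer coefficients in place of $E_{rr'}$ and $F_{rs\,r's'}$. No symmetry of these coefficients is imposed, since the conservation of the number of electrons does not depend on it. All names are illustrative.

```python
import random

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def axpy(A, c, B):
    """Return A + c*B."""
    return [[x + c * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def zeros(n):
    return [[0] * n for _ in range(n)]

def fermion_operators(M):
    """Operators a_1..a_M as direct products with sign factors 1 - 2n."""
    unit, alpha, sign = [[1, 0], [0, 1]], [[0, 1], [0, 0]], [[1, 0], [0, -1]]
    ops = []
    for r in range(M):
        op = [[1]]
        for f in [sign] * r + [alpha] + [unit] * (M - r - 1):
            op = kron(op, f)
        ops.append(op)
    return ops

def number_commutes_with_energy(M=3, seed=0):
    rng = random.Random(seed)
    a = fermion_operators(M)
    ad = [transpose(x) for x in a]      # real matrices: adjoint = transpose
    dim = 2 ** M
    idx = range(M)
    # Twice the operator K of (402); the factor 2 only avoids the fraction 1/2.
    K2 = zeros(dim)
    for r in idx:
        for rp in idx:
            K2 = axpy(K2, 2 * rng.randint(-3, 3), matmul(ad[r], a[rp]))
    for r in idx:
        for s in idx:
            for rp in idx:
                for sp in idx:
                    # order of factors: a_r^+ a_s^+ a_s' a_r'
                    term = matmul(matmul(ad[r], ad[s]), matmul(a[sp], a[rp]))
                    K2 = axpy(K2, rng.randint(-3, 3), term)
    N = zeros(dim)
    for r in idx:
        N = axpy(N, 1, matmul(ad[r], a[r]))
    commutator = axpy(matmul(N, K2), -1, matmul(K2, N))
    return commutator == zeros(dim)

print(number_commutes_with_energy())
```

Every term of $K$ contains as many operators $a^\dagger$ as operators $a$, which is why the commutator vanishes for any choice of coefficients.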
Putting $F(x, x') = e^2/r(x, x')$, which corresponds to an ordinary
Coulomb interaction between the electrons, and introducing further
the operator

$$\varphi(x) = e\int\frac{\Psi^\dagger(x')\Psi(x')}{r(x, x')}\,dx', \tag{406 a}$$

which represents the electric potential due to a distribution of electricity
with a density $e\rho(x', x') = e\Psi^\dagger(x')\Psi(x')$,
we can replace (406) by an expression of still more familiar type,

$$K = \int\Psi^\dagger(x)\,(E + \tfrac12 e\varphi)\,\Psi(x)\,dx, \tag{406 b}$$

corresponding to the average value of the energy $W$ for an electron
moving in an electric field which consists of an external part (included
in $E$) and a quasi-external part, due, as it were, to its own field and
represented by the electric potential with the extra factor $\tfrac12$. It must
clearly be understood that no actual self-action of a single electron is
implied by our theory, the commutation properties of the operator-functions
$\Psi$ being precisely such as to exclude any self-action.

These commutation properties are easily derived from those of the
operators $a_r$ and from the orthogonality and normalizing conditions for
the functions $\psi_r(x)$. We have, namely, multiplying $\Psi(x)$ by $\Psi(x')$,

$$\Psi(x)\Psi(x') = \sum_r a_r\psi_r(x)\sum_s a_s\psi_s(x') = \sum_r\sum_s a_r a_s\,\psi_r(x)\psi_s(x') = -\sum_r\sum_s a_s a_r\,\psi_r(x)\psi_s(x'),$$

that is,

$$\Psi(x)\Psi(x') + \Psi(x')\Psi(x) = 0, \tag{407}$$

and likewise,

$$\Psi^\dagger(x)\Psi^\dagger(x') + \Psi^\dagger(x')\Psi^\dagger(x) = 0.$$

We have further

$$\Psi^\dagger(x)\Psi(x') = \sum_r\sum_s a_r^\dagger a_s\,\psi_r^*(x)\psi_s(x') = -\sum_r\sum_s a_s a_r^\dagger\,\psi_r^*(x)\psi_s(x') + \sum_r\sum_s\delta_{rs}\,\psi_r^*(x)\psi_s(x'),$$

whence

$$\Psi^\dagger(x)\Psi(x') + \Psi(x')\Psi^\dagger(x) = \delta(x - x'), \tag{407 a}$$

where $\delta(x - x')$ denotes the product of the Dirac $\delta$-functions for the
geometrical coordinates by $\delta_{\xi\xi'}$ if $\xi$ and $\xi'$ are the values of the spin
coordinates associated with the points $x$ and $x'$. It should be remarked
that the formula $a_r^\dagger a_r = n_r$ is replaced in the present case by the
formula (405 b) or

$$\int\Psi^\dagger(x)\Psi(x)\,dx = N. \tag{407 b}$$


The functions $\psi_r(x)$ which serve to define the operator $\Psi(x)$ have been
left hitherto entirely arbitrary apart from the condition of being
mutually orthogonal and normalized. The actual problem, which was
put at the beginning of this section, was to find the coefficients $C_n$
which determine the wave function $\Omega = \sum_n C_n\chi_n$ describing the behaviour
of the system of electrons under consideration in the configuration
space. From this point of view the functions $\psi_r(x)$ play only an auxiliary
role.

But on the other hand, it is clear that the preceding theory can give
results of real practical value only in the case when the separate
antisymmetrical functions $\chi_n$ form a good approximation to the functions
$\Omega_W$ which describe the stationary states of the system when the external
field does not depend upon the time, or specific types of motion in
a given variable external field. Assuming the latter to be constant, we
are thus led to the problem of determining the individual wave functions
$\psi_r(x)$ in such a way as to make the functions $\chi_n$ the best possible
approximations to the exact antisymmetrical wave functions describing
the stationary states of the system.

Let us consider a stationary state of the system as determined by
the exact equation

$$KC = WC \tag{408}$$

to which the general equation $-\frac{h}{2\pi i}\frac{dC}{dt} = KC$ is reduced if the operator
$-\frac{h}{2\pi i}\frac{\partial}{\partial t}$ is replaced by the characteristic value of the energy $W$, i.e. if
all the values of the function $C(n_1, n_2, \ldots)$ for which $\sum_r n_r = N$ are
assumed to be proportional to $e^{-i2\pi Wt/h}$ just as in the case of an ordinary
Schrödinger equation. We then get, denoting the amplitude of $C_n$ by
$C_n^0$, the following exact representation of a stationary state (in the
configuration space):

$$\Omega_W = \sum_n C_n^0\chi_n(X), \tag{408 a}$$

where $X$ denotes the totality of the coordinates of all the electrons.

Now the equation (408) must obviously be equivalent to the variational
equation $\delta W = \delta\int\Omega_W^* H\Omega_W\,dX = 0$ with $\Omega_W$ written in the form
(408 a), in conjunction with the condition $\sum_n C_n^{0*}C_n^0 = 1$. This variational
equation can serve for the determination of the coefficients $C_n^0$
if the functions $\chi_n$, i.e. the individual functions $\psi_r$, are known. Or it
can be used for the determination of the latter if the coefficients $C_n^0$ are
known. Assuming them for the moment to vanish for all the subscripts
$n$ except one, we get back to the self-consistent field considered in the
preceding section.

The question we were discussing above is thus reduced to the following
one: Is it possible to determine both the functions $\psi_r$ and the
coefficients $C_n$ from the same variational equation $\delta W = 0$, where
$W = \int\Omega_W^* H\Omega_W\,dX$? Such a determination is certainly possible for
a function (408 a) containing a finite number of terms; if only one of
them is different from zero we get back to the problem of the self-consistent
field already solved.

However, the solution thus obtained will contain, as in the simple case
just mentioned, a certain amount of arbitrariness in the form of the functions
$\psi_r(x)$ (the latter being replaceable by any other set derived from
them by a linear orthogonal transformation). This arbitrariness will
increase with the number of non-vanishing terms and will become
infinite in the limiting case of an infinite series (408 a). So long however
as we are looking not for a formal but for a practical solution of
our problem, we can deal with it as if the number of terms in (408 a)
were finite; this procedure will ensure the most rapid convergence of
the series obtained in the limiting case. Dropping the affixes $W$ and 0
in (408 a), we have

$$W = \int\Omega^* H\Omega\,dX = \sum_n\sum_{n'} C_n^* C_{n'}\int\chi_n^* H\chi_{n'}\,dX,$$

that is,

$$W = \sum_n\sum_{n'} C_n^* H_{nn'} C_{n'},$$

which according to our previous results can be rewritten in the form

$$W = \sum_n C_n^* KC_n. \tag{409}$$


The problem we have considered hitherto was equivalent to the variation
of the coefficients $C_n$, the operator $K$ being fixed. It could thus
be expressed by the equation

$$\sum_n\delta C_n^*\,KC_n + \sum_n C_n^*\,K\,\delta C_n = 0$$

along with

$$\sum_n\delta C_n^*\,C_n = \sum_n C_n^*\,\delta C_n = 0,$$

which brings us back to the equation (408). The next step which we
must undertake in order to secure the best possible approximation
consists in the variation of the functions $\psi_r(x)$, i.e. of the operator $K$ which
they define. We thus get the additional equation

$$\sum_n C_n^*\,\delta K\,C_n = 0. \tag{409 a}$$

With the help of the expressions (406) and (406 b) this is easily reduced
to the following form:

$$\sum_n C_n^*\Bigl\{\delta\int\Psi^\dagger(x)[E + e\varphi(x)]\Psi(x)\,dx\Bigr\}C_n = 0,$$

provided we consider the variations of the operators $\Psi(x)$ and $\Psi^\dagger(x)$ as
independent of each other, apart from being subject to the condition
$\int\delta\Psi^\dagger(x)\cdot\Psi(x)\,dx = 0$ [and $\int\Psi^\dagger(x)\cdot\delta\Psi(x)\,dx = 0$]. Let us consider $C$ and
$C^\dagger$ as a one-column or a one-row matrix and introduce the functions

$$\omega = \Psi C, \qquad \omega^\dagger = C^\dagger\Psi^\dagger. \tag{409 b}$$

The preceding equations can then be rewritten as follows:

$$\int\delta\omega^\dagger(x)[E + e\varphi(x)]\omega(x)\,dx = 0$$

and

$$\int\delta\omega^\dagger(x)\cdot\omega(x)\,dx = 0,$$

where $\delta\omega^\dagger = C^\dagger\delta\Psi^\dagger$.
Applying Lagrange's method of undetermined multipliers we thus
obtain the equation

$$[E + e\varphi(x) - W]\omega(x) = 0$$

or

$$[E + e\varphi(x) - W]\Psi(x)C = 0, \tag{410}$$

where $\varphi(x)$ is defined by (406 a) and $W$ denotes a constant, the characteristic
value of the energy we are looking for. This equation can serve
for the determination of the functions $\psi_r$ so long as the coefficients $C_n$
are supposed to be known for any choice of these functions. The equation
(410) is a good illustration of the method of double quantization,
as it is operational in the double sense of $\Psi(x)$ operating on the matrix
$C$ and $E + e\varphi(x) - W$ operating on $\Psi(x)$.

Assuming the first of these operations to be understood, we can
rewrite the preceding equation in the form

$$[E + e\varphi(x) - W]\Psi(x) = 0 \tag{410 a}$$

as for an ordinary wave function, with the only difference that the
additional potential energy $e\varphi(x)$ is itself dependent upon the function
$\Psi$. It should be mentioned that this dependence can be expressed by
the differential equation

$$\nabla^2\varphi = -4\pi e\,\Psi^\dagger(x)\Psi(x), \tag{410 b}$$

which is equivalent to the integral expression (406 a) for the operator $\varphi$.
This circumstance can be used, as will be shown later on, for a very
important generalization of the theory in the sense of taking into
account the exact electrodynamical laws which govern the interaction
of the electrons.


47. Intensity Quantization of Particles described in the Configuration
Space by a Symmetrical Wave Function (Einstein-Bose Statistics)

The reduction of the problem of a system of identical particles to that
of a single particle, corresponding to the method of copies (Part I, § 20),
has been considered hitherto for the case of electrons, or more
generally such particles as in the method of the configuration space
are described by antisymmetrical wave functions. We are now going
to consider the same question for the case of particles which belong
to the symmetrical type, and conform accordingly to the statistics of
Einstein-Bose (for instance, $\alpha$-particles, hydrogen atoms considered as
elementary particles, etc.).

Let us start as before with a set of $N$ different individual states
specified by the mutually orthogonal and normalized functions $\psi_1(x)$,
$\psi_2(x)$, ..., $\psi_N(x)$. Introducing the factorized wave function

$$\phi(X) = \psi_1(x_1)\psi_2(x_2)\cdots\psi_N(x_N),$$

we can define the symmetrical wave function describing the whole
system by the formula

$$\chi = \frac{1}{\sqrt{N!}}\sum_P P\phi, \tag{411}$$

which differs from the corresponding antisymmetrical wave function
by the absence of the sign-factors $\epsilon_P$. This is of course a slight simplification
so long as we are considering a set of $N$ different individual
states. We get in this case from the variational equation

$$\delta\int\chi^* H\chi\,dX = 0$$

in conjunction with the conditions

$$\int\psi_r^*(x)\psi_s(x)\,dx = \delta_{rs}$$

the following system of equations for the functions $\psi_r$ [corresponding
to the method of the self-consistent field, cf. (366)],


(B+ B-AsWila)+ % (Anat Anca) = 0, (411 a) 
the functions $B(x)$ and $A_{rs}(x)$ being defined by the same formulae,
(365), (365 a), as before. The energy (or its probable value) is expressed
accordingly by the formula


JN!) f o*Hy dX = § [ $*HP¢ dX, 
which gives , 
W = Ff dt o[(B+ 1B dbla) +1 ¥ Andale]. 
that is, rs 
W = ¥ (Ei. | Ascdet(@) dx) + 
+4 [ F(x,2'olx,2)p(a’,2")+ |p(a,2’)|*] dade’ 
Wms ZB ff Fez’ ibsle/*i(e') ? ded’) + 


+4 ff F(x, x')[ p(x, x) p(x’, x')+ |p(x, x’) |?] drdx’, (411 b) 
We thus see that in the case of symmetrical wave functions the density 
matrix p(x,x’) cannot replace the separate wave functions. Using the 
notation (395a) for the matrix elements of F and affixing the index 
n to x, we can rewrite the preceding expression in the form 


Hy, = | xt Hx, aX = SB, —Fere)+ DS (Firat Fraser) (412) 


similar to the expression (396) which corresponds to the antisymmetrical
case with a similar condition as to the arrangement of the indices
$r_1, r_2, \ldots$ in $\phi = \psi_{r_1}(x_1)\psi_{r_2}(x_2)\cdots\psi_{r_N}(x_N)$.

If $\chi_n$ is replaced by a function $\chi_{n'}$ differing from $\chi_n$ in that a single
individual function $\psi_p$ is replaced by $\psi_{p'}$, we get likewise


if, ye OS Fyyt 2. Ping two): (412 a) 
rip 


If finally two of the functions serving to construct $\chi_n$, $\psi_p$ and $\psi_q$ say,
are replaced by two new ones different from each other and from all
the other functions in the original set, we get
|) P npg t ped y (p <q). (412 b) 
Let us now pass to the general case, which has no parallel in the theory
of antisymmetrical functions $\chi$, where certain individual states are
multiply occupied, so that, for instance, each function $\psi_r(x)$ occurs $n_r$
times ($n_r \ge 0$) in the set specified by the index $n$, where the sum
$\sum_r n_r$ must of course be equal to the number of particles.
The formula (411) will still be valid in this case, except for the
normalization factor which must be replaced by $\bigl(N!/\prod_r n_r!\bigr)^{-1/2}$ if only
effectively different permutations $P$ are included in the sum (411), i.e.
such permutations as interchange particles associated with different
individual states. Thus, leaving aside trivial permutations, we can
define the normalized symmetrical wave functions by the formula

$$\chi_n = \frac{1}{\sqrt{g_n}}\sum_P P\phi_n, \tag{413}$$

where

$$g_n = \frac{N!}{n_1!\,n_2!\,n_3!\cdots} = \frac{N!}{\prod_r n_r!}. \tag{413 a}$$
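The number $g_n$ of effectively different permutations can be confirmed by direct enumeration for a small system. The sketch below is illustrative: the occupation numbers are given as a list, each particle is labelled by the state it occupies, and the distinct orderings of these labels are counted and compared with the formula (413 a). The function names are hypothetical.

```python
from itertools import permutations
from math import factorial

def g_formula(occupation):
    """g_n = N! / (n_1! n_2! ...) for a list of occupation numbers."""
    total = sum(occupation)
    denominator = 1
    for n_r in occupation:
        denominator *= factorial(n_r)
    return factorial(total) // denominator

def g_by_enumeration(occupation):
    """Count the distinct assignments of the occupied states to the particles."""
    # one label per particle: state r appears n_r times
    labels = [r for r, n_r in enumerate(occupation) for _ in range(n_r)]
    return len(set(permutations(labels)))

example = [2, 1, 0, 1]      # N = 4 particles distributed over four states
print(g_formula(example), g_by_enumeration(example))
```

When every occupied state is singly occupied the count reduces to $N!$, which is the normalization of (411); when all particles are in one state it reduces to 1.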


We then get, instead of (412), 
Pits = > (Le — Frere) + pe. N,. 1,( Frere | em (414) 


The calculation of the matrix elements corresponding to (412 a) requires
a little more care.

We must in the first place determine the number of times a given
variable, $x_1$ say, will be met in the sum $\sum_P P\phi_n$ associated with a certain
function $\psi_p$. The number is obviously equal to

$$\frac{(N-1)!}{(n_p - 1)!\,\prod_{s \ne p} n_s!} = \frac{n_p}{N}\,g_n.$$

The function $\phi_{n'}$ will differ from $\phi_n$ by the fact that it will contain
$n'_p = n_p - 1$ factors $\psi_p$ and $n'_{p'} = n_{p'} + 1$ factors $\psi_{p'}$. Now the matrix
element of $E_1$ with respect to the functions $P\phi_n$ and $P'\phi_{n'}$ will be
different from zero only if the variable $x_1$ is shifted from $\psi_p$ to $\psi_{p'}$, all
the rest remaining in their places. There will thus be $n_p g_n/N$ terms
equal to $(E_1)_{pp'}$. Now since $H$ contains the sum of $N$ terms $E_i$, corresponding
to the proper energy of all the $N$ particles, the matrix element
$E_{pp'}$ will appear in the expression $\sum_P\sum_{P'} P\phi_n^*\,H\,P'\phi_{n'}$ just $n_p g_n$ times.
The coefficient of $E_{pp'}$ in $H_{nn'}$ will thus be

$$\frac{n_p g_n}{\sqrt{g_n g_{n'}}} = n_p\sqrt{\frac{g_n}{g_{n'}}} = \sqrt{n_p}\sqrt{n_{p'} + 1}.$$

The same argument applies to the second term in (412 a) which corresponds
to the mutual energy of the particles and must besides be
multiplied by $n_r$. We get accordingly, instead of (412 a),


Baye = (ny) (tp + Iz ‘pp’ + p> n,( Fonprt Fi eae} (414a) 
By a similar argument we obtain, instead of (412 b),


Hye = Vy Vga (Ny +) l(ng +1) Foqivg tag’): «(414 b) 


If we substitute these expressions in the equations

$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = \sum_{n'} H_{nn'}C_{n'},$$

which determine the coefficients of the expansion $\Omega = \sum_n C_n\chi_n$, and
introduce the operators $\alpha_r$ already used, we can bring them to the
standard form

$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = KC_n,$$

with the following expression for the operator $K$:
 — p2 n, E,,+ 2, > Vn, (Vny + IE,» of, ap — 


ae = n, Piet 2 > n, 14(Frsret feat 
+ 23M (np 1) me, pipet Forrpo, Xy + 
7 222 > My Sie AV (tg +1) Fogpe + Fpaa'n' )% % %y 


pt re rn 
or 


K= 24, n+ 2 LaF ppt Wy Xp Op NVity + 22 1,(%,—5y5)( Frere t+ Freer) + 
- EVI AE rip’ t | Wn p ah, Oy Vitg + 


r>p p 


+ paps > 2 (F, agit +#F paw) VMp a, Vn, of Og Ving: Xp Nn. 
P< P FY; 
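This result can further be simplified if we replace the numbers $n_r$ by
operators, represented by the diagonal matrices

$$n_r = \begin{pmatrix}0&0&0&0&\cdots\\0&1&0&0&\cdots\\0&0&2&0&\cdots\\0&0&0&3&\cdots\\ \vdots&\vdots&\vdots&\vdots&\ddots\end{pmatrix}, \tag{415}$$

and represent accordingly the operators $\alpha_r$ and $\alpha_r^\dagger$ (from the point of
view of $n_r$) by the matrices

$$\alpha_r = \begin{pmatrix}0&1&0&0&\cdots\\0&0&1&0&\cdots\\0&0&0&1&\cdots\\0&0&0&0&\cdots\\ \vdots&\vdots&\vdots&\vdots&\ddots\end{pmatrix}, \qquad \alpha_r^\dagger = \begin{pmatrix}0&0&0&0&\cdots\\1&0&0&0&\cdots\\0&1&0&0&\cdots\\0&0&1&0&\cdots\\ \vdots&\vdots&\vdots&\vdots&\ddots\end{pmatrix}, \tag{415 a}$$

which are a generalization of the matrices used to represent the operators
$n_r$, $\alpha_r$, and $\alpha_r^\dagger$ in the antisymmetrical case. We can then put, since

$$\alpha_r^\dagger\alpha_r = \begin{pmatrix}0&0&0&\cdots\\0&1&0&\cdots\\0&0&1&\cdots\\ \vdots&\vdots&\vdots&\ddots\end{pmatrix},$$

$$n_r = \sqrt{n_r}\,\alpha_r^\dagger\alpha_r\sqrt{n_r},$$

where

$$\sqrt{n_r} = \begin{pmatrix}0&0&0&0&\cdots\\0&\sqrt1&0&0&\cdots\\0&0&\sqrt2&0&\cdots\\0&0&0&\sqrt3&\cdots\\ \vdots&\vdots&\vdots&\vdots&\ddots\end{pmatrix}, \tag{415 b}$$

and combine the first two terms of $K$ into a single one,

$$\sum_r\sum_{r'} E_{rr'}\sqrt{n_r}\,\alpha_r^\dagger\alpha_{r'}\sqrt{n_{r'}},$$

the summation over $r$ and $r'$ being unrestricted (i.e. vanishing terms
being cancelled out automatically).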

The other three terms corresponding to the mutual energy F can 
likewise be combined into a single one, 


x = (F, Pap” +f, adn’) Mp alt, ng of Og: ng Op’ Vy, 
the second term corresponding to the case $q' = q$ and the first to the
case $q' = q$, $p' = p$. It should be noticed that the term $n_r F_{rrrr}$ is subtracted
automatically in virtue of the relation $n_r\alpha_r = \alpha_r(n_r - 1)$, which
reduces the product

$$\sqrt{n_r}\,\alpha_r^\dagger\sqrt{n_r}\,\alpha_r^\dagger\alpha_r\sqrt{n_r}\,\alpha_r\sqrt{n_r} = \sqrt{n_r}\,\alpha_r^\dagger n_r\alpha_r\sqrt{n_r}$$

to

$$\sqrt{n_r}\,\alpha_r^\dagger\alpha_r(n_r - 1)\sqrt{n_r} = n_r(n_r - 1).$$
It now remains to introduce, instead of the operators $n_r$, $\alpha_r$, and $\alpha_r^\dagger$,
combined operators which can be defined by the formulae

$$b_r = \alpha_r\sqrt{n_r}, \qquad b_r^\dagger = \sqrt{n_r}\,\alpha_r^\dagger \tag{416}$$

or by the relations

$$b_r^\dagger b_r = n_r, \qquad b_r b_r^\dagger = n_r + 1 \tag{416 a}$$

following therefrom, and which will be subject to the commutation
conditions

$$b_r b_s - b_s b_r = 0, \qquad b_r^\dagger b_s^\dagger - b_s^\dagger b_r^\dagger = 0, \tag{417}$$

$$b_r b_s^\dagger - b_s^\dagger b_r = \delta_{rs} \tag{417 a}$$

(the latter being in agreement with (416 a) in the case $r = s$). With
the help of these operators, which are quite similar to the operators
$a_r$, $a_r^\dagger$ [differing from the latter by the sign only in the commutation
relations (417), (417 a)], the operator $K$ can be written in the form

$$K = \sum_r\sum_{r'} E_{rr'}\,b_r^\dagger b_{r'} + \tfrac12\sum_r\sum_s\sum_{r'}\sum_{s'} F_{rs\,r's'}\,b_r^\dagger b_s^\dagger b_{s'}b_{r'}, \tag{417 b}$$

which can be obtained from (402) by replacing the $a$'s by the $b$'s. It
should be noticed that the order of the two last factors in (417 b) is
irrelevant [while it is very important in (402)] since they commute with
each other.
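The matrices (415), (415 a), (415 b) and the relations (416 a), (417 a) can be illustrated for a single state by cutting the infinite matrices off at a finite size. The sketch below is only an illustration under that assumption: `d` is a hypothetical cut-off, so that $n$ runs from 0 to `d - 1`, and the relation $b b^\dagger - b^\dagger b = 1$ is expected to hold on every diagonal element except the last, where the truncation makes itself felt.

```python
import math

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def boson_matrices(d):
    """Truncated alpha, sqrt(n), b = alpha sqrt(n) and b^dagger for n = 0..d-1."""
    alpha = [[1.0 if j == i + 1 else 0.0 for j in range(d)] for i in range(d)]
    sqrt_n = [[math.sqrt(i) if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = matmul(alpha, sqrt_n)        # (416): b = alpha sqrt(n)
    b_dag = transpose(b)             # real matrix, so adjoint = transpose
    return alpha, sqrt_n, b, b_dag

def occupation_diagonal(d):
    """Diagonal of b^dagger b, to be compared with 0, 1, 2, ..."""
    _, _, b, b_dag = boson_matrices(d)
    n = matmul(b_dag, b)
    return [n[i][i] for i in range(d)]

def commutator_diagonal(d):
    """Diagonal of b b^dagger - b^dagger b."""
    _, _, b, b_dag = boson_matrices(d)
    c = sub(matmul(b, b_dag), matmul(b_dag, b))
    return [c[i][i] for i in range(d)]

print(occupation_diagonal(5))
print(commutator_diagonal(5))
```

The last diagonal element of the commutator differs from 1 only because the state with $n$ = `d` has been cut away; increasing `d` pushes this defect to higher occupation numbers.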

The commutation relations (417 a), just like their analogues (403 a)
and (403 b), are actually self-supporting and can be used to define
the operators $n_r$ by one of the expressions (416). The fact that the
characteristic values of these operators are equal to 0, 1, 2, ... can be
considered as a consequence of the relations (417) and (417 a). We
need not repeat in detail all that has been said in the preceding section
about the operator

$$N = \sum_r n_r$$

representing the total number of particles, and the functional operators

$$\psi(x) = \sum_r b_r\psi_r(x), \qquad \psi^\dagger(x) = \sum_r b_r^\dagger\psi_r^*(x). \tag{418}$$

It need only be stated that they satisfy the relations

$$\psi(x)\psi(x') - \psi(x')\psi(x) = 0, \qquad \psi^\dagger(x)\psi^\dagger(x') - \psi^\dagger(x')\psi^\dagger(x) = 0, \tag{418 a}$$

$$\psi(x)\psi^\dagger(x') - \psi^\dagger(x')\psi(x) = \delta(x - x') \tag{418 b}$$



and can serve to express the energy operator in the form

$$K = \int\psi^\dagger(x)E\psi(x)\,dx + \tfrac12\iint\psi^\dagger(x)\psi^\dagger(x')F(x, x')\psi(x')\psi(x)\,dx\,dx',$$

the order of the first two or of the last two factors in the double integral
being irrelevant. In the case of a stationary state of the system of
particles the equation of motion reduces to the form

$$KC = W\,C,$$

which can be derived from the variation principle $\delta\sum_n C_n^* KC_n = 0$ in
conjunction with the condition $\sum_n C_n^* C_n = 1$. The same principle when
applied once more to the operator $K$ itself, i.e. in the form

$$\sum_n C_n^*\,\delta K\,C_n = 0,$$
leads to a double operator equation 
Of E+ [ Peez'yt(e'e') de'|y(a)C = 0 


for the determination of the functions y,(z). 
In the special case when there is no interaction between the
particles ($F = 0$) the transition from the equations

$$-\frac{h}{2\pi i}\frac{dc_r}{dt} = \sum_{r'} E_{rr'}c_{r'}, \tag{419}$$

describing the motion of a single particle, specified by the energy
operator $E$ and the wave function $\psi(x) = \sum_r c_r\psi_r(x)$, to the equations

$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = \sum_{n'} H_{nn'}C_{n'} \tag{419 a}$$

for any number $N$ of such particles ($H = \sum_{i=1}^{N} E_i$) can be carried out,
according to Dirac, in the following way:
The right-hand side of the equation (419) can be defined as the differential
coefficient with respect to $c_r^*$ of the expression

$$E = \sum_r\sum_{r'} c_r^* E_{rr'} c_{r'}, \tag{420}$$

which represents the probable value of the energy in the state specified
by the wave function $\psi(x) = \sum_r c_r\psi_r(x)$. We thus get

$$-\frac{h}{2\pi i}\frac{dc_r}{dt} = \frac{\partial E}{\partial c_r^*},$$

and in a similar way

$$\frac{h}{2\pi i}\frac{dc_r^*}{dt} = \frac{\partial E}{\partial c_r}.$$

If we now put

$$c_r = Q_r, \qquad -\frac{h}{2\pi i}c_r^* = P_r \tag{420 a}$$

or

$$c_r^* = Q_r, \qquad \frac{h}{2\pi i}c_r = P_r, \tag{420 b}$$

these equations can be rewritten in the form

$$\frac{dP_r}{dt} = -\frac{\partial E}{\partial Q_r}, \qquad \frac{dQ_r}{dt} = \frac{\partial E}{\partial P_r}, \tag{420 c}$$

i.e. in the standard canonical form of the classical equations of motion.
The variables $Q_r$ and $P_r$ can be identified with the generalized coordinates
and momenta, the Hamiltonian $E$ being a bilinear function
of them both and the number of degrees of freedom being infinite.
Let us now pass over from the equations (420c) to the corresponding 
wave-mechanical equation 


(421) 


where $w$ is the wave function, or probability amplitude for given
values of the coordinates $Q$, the classical momenta $P_r$ being replaced
by the differential operators $\frac{h}{2\pi i}\frac{\partial}{\partial Q_r}$. Or let us take the equations
(420) directly over into the quantum theory considering the variables
$Q$ and $P$ as operators (matrices) which satisfy the commutation relations

$$Q_r Q_{r'} - Q_{r'}Q_r = P_r P_{r'} - P_{r'}P_r = 0, \qquad P_r Q_{r'} - Q_{r'}P_r = \frac{h}{2\pi i}\delta_{rr'}.$$

Replacing here $P_r$ and $Q_r$ by their expressions in terms of $c_r$ and $c_r^*$,
we get

$$c_r^* c_{r'}^* - c_{r'}^* c_r^* = c_r c_{r'} - c_{r'}c_r = 0, \qquad c_r^* c_{r'} - c_{r'}c_r^* = -\delta_{rr'}. \tag{421 a}$$

These relations are equivalent to the wave-mechanical relation

$$c_r^* = -\frac{\partial}{\partial c_r} \qquad \Bigl(\text{or } c_r = \frac{\partial}{\partial c_r^*}\Bigr), \tag{421 b}$$

which follows from $P_r = \frac{h}{2\pi i}\frac{\partial}{\partial Q_r}$. We thus see that the coefficients
$c$ and $c^*$ satisfy exactly those conditions which have been established
above for the operators $b$ and $b^\dagger$, and can accordingly be identified
with the latter.

The application of the quantization process just shown ('second
quantization') to the coefficients $c$, $c^*$, i.e. their replacement by the
operators $b$ and $b^\dagger$, thus leads us directly from the equations (419)
(which with their conjugate complex can be considered as a system
of canonical equations in the classical sense) to the 'wave-mechanical'
equation (421), that is,
a) h @ 
EB. obi wc. a oe ol 
b > Erb at - Oni Ob 
or the equivalent (operator) equation

$$\Bigl[\sum_r\sum_{r'} E_{rr'}\,b_r^\dagger b_{r'}\Bigr]w = W\,w.$$

This is no other than our previous equation

$$KC = W\,C$$

with $w$ replacing $C$ and with an operator $K$ of the form

$$K = \sum_r\sum_{r'} E_{rr'}\,b_r^\dagger b_{r'},$$

which corresponds to a system of identical particles describable by
symmetrical wave functions in the configuration space without any
interaction.

In other words, the quantization of the equations (419), describing 
the motion of a single particle, leads us to an equation describing the 
motion of any number of such particles—provided they conform to 
the statistics of Einstein and Bose. The actual number of these particles 
is equal to one of the characteristic values of the operator

$$N = \sum_r b_r^\dagger b_r,$$

and remains a constant of the motion since $N$ commutes with $K$. The
motion of the whole assembly of particles is described by the
operator-function

$$\psi(x) = \sum_r b_r \psi_r(x),$$

with the help of which the energy operator can be written in the form

$$K = \int \psi^\dagger(x)\, E\, \psi(x)\, dx.$$
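
That an operator of the form $\sum_r\sum_s E_{rs} b_r^\dagger b_s$ leaves the total number of particles unchanged can be illustrated on states labelled by their partition numbers. This is a sketch under stated assumptions: the two-state matrix `E` is invented for the example, and a state is represented as a dictionary from occupation tuples to amplitudes.

```python
import math


def apply_b(state, r):
    """Apply b_r: lower the r-th occupation number, weight sqrt(n_r)."""
    out = {}
    for occ, amp in state.items():
        if occ[r] > 0:
            new = list(occ)
            new[r] -= 1
            key = tuple(new)
            out[key] = out.get(key, 0.0) + amp * math.sqrt(occ[r])
    return out


def apply_b_dag(state, r):
    """Apply b_r^dagger: raise the r-th occupation number, weight sqrt(n_r + 1)."""
    out = {}
    for occ, amp in state.items():
        new = list(occ)
        new[r] += 1
        key = tuple(new)
        out[key] = out.get(key, 0.0) + amp * math.sqrt(occ[r] + 1)
    return out


def apply_K(state, E):
    """Apply K = sum_rs E[r][s] b_r^dagger b_s to a state."""
    out = {}
    for r in range(len(E)):
        for s in range(len(E)):
            term = apply_b_dag(apply_b(state, s), r)
            for key, amp in term.items():
                out[key] = out.get(key, 0.0) + E[r][s] * amp
    return out


E = [[1.0, 0.3], [0.3, 2.0]]     # hypothetical one-particle energy matrix
start = {(2, 1): 1.0}            # two particles in state 0, one in state 1
result = apply_K(start, E)
totals = {sum(occ) for occ in result}   # total particle number of each component
```

Here `result` contains the components (2, 1), (3, 0) and (1, 2), so `totals` is the single value 3: the off-diagonal elements of `E` move a particle from one individual state to another but never change their number.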


An exactly similar scheme can be applied, according to Jordan and 
Klein, in the general case of a system of identical particles of the 
‘symmetrical type’ interacting with each other, if this interaction is 
represented by a ‘quasi-external’ potential energy of the form 


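$$\tfrac{1}{2}V(x) = \tfrac{1}{2}\int \psi^*(x')\, F(x, x')\, \psi(x')\, dx',$$

the operator of the proper energy being replaced accordingly by
$E + \tfrac{1}{2}V$. We then get, putting $\psi(x) = \sum_r c_r \psi_r(x)$,

$$V(x) = \sum_{r'} \sum_{s'} V_{r's'}(x)\, c_{r'}^* c_{s'},$$

and consequently

$$K = \int \psi^*(x)\big(E + \tfrac{1}{2}V(x)\big)\psi(x)\, dx
  = \int \psi^*(x)\, E\, \psi(x)\, dx + \tfrac{1}{2}\iint \psi^*(x)\psi^*(x')\, F(x, x')\, \psi(x)\psi(x')\, dx\, dx',$$

or

$$K = \sum_r \sum_s E_{rs}\, c_r^* c_s + \tfrac{1}{2}\sum_r \sum_{r'} \sum_s \sum_{s'} F_{rr';ss'}\, c_r^* c_{r'}^* c_s c_{s'},$$

with

$$F_{rr';ss'} = (V_{r's'})_{rs} = \iint \psi_r^*(x)\psi_{r'}^*(x')\, F(x, x')\, \psi_s(x)\psi_{s'}(x')\, dx\, dx'.$$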


It now remains to replace the numerical coefficients $c$ and $c^*$ by the
operators $b$ and $b^\dagger$ in order to obtain the energy operator $K$
corresponding to the problem of many particles.

It should be mentioned that the ‘quasi-classical equations’ for the
coefficients $c$ can be written in the general case in the same canonical
form (420b) as in the special case of no interaction, and that the
transition to the quantum (or doubly quantum) equations can be effected as
before by treating the coefficients $c_r^*$ as the operators
$-\partial/\partial c_r$ (or $c_r$ as $\partial/\partial c_r^*$).

The preceding scheme for carrying out the process of second (intensity)
quantization could be applied in principle to the case of particles of the
antisymmetrical type just as well as to particles of a symmetrical type,
namely, by substituting the operators $a$ instead of the $b$'s for the
coefficients $c$. It would, however, be impossible in this case to consider
the conjugate complex coefficients $c^*$ as differential operators
$-\partial/\partial c$ and to repeat with regard to the quasi-classical
equations for the $c$'s and $c^*$'s the same process which leads from the
classical equations of the motion of a particle to the wave-mechanical
equation.

The operators $b_r$ and $b_r^\dagger$ are written by Dirac in the form

$$b_r = e^{i2\pi\theta_r/h}\sqrt{n_r}, \qquad b_r^\dagger = \sqrt{n_r}\, e^{-i2\pi\theta_r/h}, \tag{422}$$

corresponding to the usual expressions for the coefficients $c_r$,
$\sqrt{n_r}$ playing the role of the modulus and $2\pi\theta_r/h$ that of the
argument or phase angle. It follows from a comparison of (422) with (416)
that the operators $e^{i2\pi\theta_r/h}$ and $e^{-i2\pi\theta_r/h}$ are no other
than the operators $a_r$ and $a_r^\dagger$ considered before. Hence it follows
that the operators $\theta_r$ can be represented from the point of view of
the operators $n_r$ by the formula

$$\theta_r = \frac{h}{2\pi i}\frac{\partial}{\partial n_r}. \tag{422 a}$$


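We have in fact, applying the operator

$$a_r = e^{i2\pi\theta_r/h} = e^{\partial/\partial n_r} = \sum_{k=0}^{\infty} \frac{1}{k!}\Big(\frac{\partial}{\partial n_r}\Big)^k$$

to any function of $n_r$,

$$e^{\partial/\partial n_r} f(n_r) = \sum_{k} \frac{1}{k!}\frac{\partial^k f}{\partial n_r^k} = f(n_r + 1)$$

by Taylor’s theorem, and in a similar way

$$e^{-i2\pi\theta_r/h} f(n_r) = e^{-\partial/\partial n_r} f(n_r) = \sum_{k} \frac{(-1)^k}{k!}\frac{\partial^k f}{\partial n_r^k} = f(n_r - 1).$$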
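
The displacement property of the exponential of $\partial/\partial n_r$ can be verified exactly for a polynomial, for which the Taylor series terminates. The sketch below assumes an arbitrary test polynomial and uses exact rational arithmetic; the function names are introduced for illustration only.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    """Differentiate a polynomial [c0, c1, c2, ...], c_k multiplying n**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, n):
    """Evaluate the polynomial at the point n."""
    return sum(c * n ** k for k, c in enumerate(coeffs))


def shifted(coeffs, n, sign=1):
    """Sum over k of sign**k / k! times the k-th derivative of f at n."""
    total, current, k = Fraction(0), list(coeffs), 0
    while current:                       # stops once all derivatives vanish
        total += Fraction(sign ** k, factorial(k)) * evaluate(current, n)
        current = derivative(current)
        k += 1
    return total


f = [Fraction(5), Fraction(-2), Fraction(0), Fraction(1)]   # f(n) = n^3 - 2n + 5
up = shifted(f, 4, +1)      # exp(+d/dn) f at n = 4
down = shifted(f, 4, -1)    # exp(-d/dn) f at n = 4
```

With this choice `up` equals `evaluate(f, 5)` and `down` equals `evaluate(f, 3)`, i.e. the two exponentials displace the argument by +1 and -1 respectively.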
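If instead of considering $b_r$ and $b_r^\dagger$ from the point of view of
$n_r$ we consider $b_r$ and $n_r$ from the point of view of $b_r^\dagger$, we
get, as has been shown before,

$$b_r = \frac{\partial}{\partial b_r^\dagger} \tag{422 b}$$

and consequently $n_r = b_r^\dagger \frac{\partial}{\partial b_r^\dagger}$. Replacing
$b_r^\dagger$ by $b_r$ as the basic quantity, we get likewise

$$b_r^\dagger = -\frac{\partial}{\partial b_r} \tag{422 c}$$

and $n_r = -\frac{\partial}{\partial b_r} b_r$.

Representations of a similar type are not possible in the case of the
operators $a$ and $a^\dagger$.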

Just as in the latter case, the operators $b$ and $b^\dagger$, which are not
Hermitian, can, however, be reduced to Hermitian operators $p$ and $q$ by
means of the relations

$$b = \tfrac{1}{2}(q + ip), \qquad b^\dagger = \tfrac{1}{2}(q - ip), \tag{423}$$

which correspond to the relations (404).
The operators $p$ and $q$ are represented by the matrices

$$q = \begin{pmatrix}
0 & \sqrt{1} & 0 & 0 & \cdots \\
\sqrt{1} & 0 & \sqrt{2} & 0 & \cdots \\
0 & \sqrt{2} & 0 & \sqrt{3} & \cdots \\
0 & 0 & \sqrt{3} & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{pmatrix}, \qquad
p = \begin{pmatrix}
0 & -i\sqrt{1} & 0 & 0 & \cdots \\
i\sqrt{1} & 0 & -i\sqrt{2} & 0 & \cdots \\
0 & i\sqrt{2} & 0 & -i\sqrt{3} & \cdots \\
0 & 0 & i\sqrt{3} & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{pmatrix},$$

which follow at once from (415a), (415b), and (416), and are easily
seen to coincide with the matrices representing the coordinate and the
momentum of a linear harmonic oscillator (cf. Chap. III, § 13). Their
non-vanishing matrix elements can indeed be written in the form

$$q_{n,n-1} = q_{n-1,n} = \sqrt{n}, \qquad p_{n-1,n} = -p_{n,n-1} = -i\sqrt{n} \tag{423 a}$$

(where the index $r$ has been dropped), and differ by certain
proportionality coefficients only from the expressions (88 a) derived in
§ 13.


From (423) we get

$$b^\dagger b = n = \tfrac{1}{4}\big[p^2 + q^2 - i(pq - qp)\big],$$
$$b b^\dagger = n + 1 = \tfrac{1}{4}\big[p^2 + q^2 + i(pq - qp)\big],$$

whence

$$pq - qp = \frac{2}{i}. \tag{423 b}$$


This reduces to the usual relation $PQ - QP = h/2\pi i$ between the
momentum $P$ and the coordinate $Q$ if they are defined as
$\sqrt{h\omega/4\pi}\, p$ and $\sqrt{h/4\pi\omega}\, q$ respectively. With the
help of the preceding relations we find the following expression for $n$:

$$n = \tfrac{1}{4}(p^2 + q^2 - 2), \tag{423 c}$$

which can be rewritten in the form

$$\tfrac{1}{2}(P^2 + \omega^2 Q^2) = \frac{h\omega}{2\pi}\big(n + \tfrac{1}{2}\big) = h\nu\big(n + \tfrac{1}{2}\big),$$

corresponding to the quantized values of the energy of a harmonic
oscillator with the frequency $\nu = \omega/2\pi$, $n$ playing the role of the
quantum number.
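
The relations between $b$, $b^\dagger$ and the Hermitian variables $p$, $q$ lend themselves to a direct numerical check. The sketch below builds truncated matrices (the size `DIM` is an assumption of the example) from $q = b + b^\dagger$, $p = (b - b^\dagger)/i$, which is (423) solved for $q$ and $p$, and inspects the diagonals of $pq - qp$ and of $\tfrac14(p^2 + q^2 - 2)$.

```python
import math

DIM = 8  # assumed truncation of the infinite matrices


def matmul(a, b):
    """Plain matrix product of two square nested-list matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# b has the elements (n-1, n) = sqrt(n); b_dag is its conjugate transpose.
b = [[0j] * DIM for _ in range(DIM)]
for n in range(1, DIM):
    b[n - 1][n] = complex(math.sqrt(n))
b_dag = [[b[j][i].conjugate() for j in range(DIM)] for i in range(DIM)]

# Relations (423) solved for q and p.
q = [[b[i][j] + b_dag[i][j] for j in range(DIM)] for i in range(DIM)]
p = [[(b[i][j] - b_dag[i][j]) / 1j for j in range(DIM)] for i in range(DIM)]

pq, qp = matmul(p, q), matmul(q, p)
pp, qq = matmul(p, p), matmul(q, q)
comm_diag = [pq[i][i] - qp[i][i] for i in range(DIM)]          # expect 2/i = -2i
n_diag = [(pp[i][i] + qq[i][i] - 2) / 4 for i in range(DIM)]   # expect 0, 1, 2, ...
```

Apart from the last row, which is spoiled by the truncation, `comm_diag` reproduces $2/i$ and `n_diag` reproduces the integers 0, 1, 2, ..., in agreement with (423 b) and (423 c).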

These results bring us back to the elementary theory of the quantized 
waves which has been sketched in Part I, § 20, with the trivial dif- 
ference that we do not have to worry about the half-integral energy 
values of the harmonic oscillators representing the different states, 
since it is not their energy, but the quantum number n which gives 
the number of particles associated with the corresponding state. It is 
of more importance that we have now obtained an exact and general 
expression for the energy K of the system of particles in terms of the 
auxiliary variables $b_r$, $b_r^\dagger$, whereas it was assumed before without
sufficient justification that this energy is simply equal to the sum
$\sum_r E_r n_r$. In reality it reduces to this expression in the special
case only of no interaction and for a special (though of course most
natural) choice of the wave functions $\psi_r$, as corresponding to the
stationary states, specified by the energy operator $E$
($E_{rs} = E_r \delta_{rs}$). In this case the energy $K$ can be expressed as a
simple function of the Hermitian variables $p$ and $q$, namely,

$$K = \tfrac{1}{4}\sum_r E_r\,(p_r^2 + q_r^2 - 2).$$

Their introduction in the general case instead of the variables $b_r$ and
$b_r^\dagger$ would, however, lead only to a useless complication of the theory.

It is interesting to find the harmonic oscillator variables replaced in the
case of electrons (or any other particles described by antisymmetrical
functions) by the spin variables $\sigma_x$, $\sigma_y$, a fact which could
hardly be anticipated in the early development of the theory of quantized
waves, given in Part I.


48. Interaction between a ‘Doubly Quantized’ System and an 

Ordinary System: Application to Photons 
We have considered hitherto a system of identical particles, with or
without interaction, in a given external field of force (specified by the
potential energy $U(x)$ or the operator $E$). We shall consider now the
more general case of such a system in interaction with some system
of a different kind which will be described to begin with in the usual
way, i.e. by giving the coordinates of all the particles constituting it.
The energy $H$ of the combined system $A+B$ will consist of three parts:
the energy of $A$ taken alone ($H_A$), that of $B$ taken alone ($H_B$), and
their mutual energy $M = H_{AB}$, which can be considered as a
perturbation.

The method of the ‘intensity quantization’ discussed in the two pre- 
ceding sections can easily be extended to the present case if the wave 
function $\Omega$ describing the whole system in the method of configuration
space is written in the form

$$\Omega(X, Y) = \sum_n \omega_n(Y, t)\, \chi_n(X), \tag{424}$$

where $X$ and $Y$ denote the totality of coordinates specifying the
corresponding system, while $\chi_n$ denotes a symmetrical or antisymmetrical
function of the coordinates $x_1, x_2, \ldots$ of the particles constituting $A$,
according to the nature of these particles. Substituting (424) in the
wave equation

$$-\frac{h}{2\pi i}\frac{\partial \Omega}{\partial t} = H\Omega$$

we obtain a system of equations

$$-\frac{h}{2\pi i}\frac{\partial \omega_n}{\partial t} = \sum_{n'} H_{nn'}\, \omega_{n'}$$

of the same kind as before, with the only difference that $H_{nn'}$ must
now be treated as an operator with regard to coordinates $Y$, and the
‘coefficients’ $\omega_n$ as functions of these coordinates and of the time.
Introducing the individual states $\psi_r$ ($r = 1, 2, \ldots$) serving to define
the functions $\chi_n$, we shall thus obtain an equation of exactly the same
sort as before for the coefficients $\omega_n$, considered as functions of the
partition numbers $n_1, n_2, \ldots, n_r, \ldots$, of the coordinates $Y$, and of the time,

with the energy operator $\sum_i E_i$ increased by $H_B$ and by the interaction
energy $M$ of $A$ and $B$. Putting

$$M = \sum_i V(x_i, Y),$$

where, in view of the identity of all the particles of $A$, $V$ is the same
function for all of them, we must simply add to the energy operator
$E$ of an individual particle the function $V(x, Y)$. We thus get the
following equation:

$$\Big\{H_B + \sum_r \sum_s \big[E_{rs} + V_{rs}(Y)\big] C_r^\dagger C_s + \tfrac{1}{2}\sum_r \sum_{r'} \sum_s \sum_{s'} F_{rr';ss'}\, C_r^\dagger C_{r'}^\dagger C_s C_{s'}\Big\}\omega = -\frac{h}{2\pi i}\frac{\partial \omega}{\partial t}, \tag{425}$$

where the operators $C$ stand for $a$ or for $b$ as the case may be.
It has been shown in Chap. VII, § 39, that it is possible to treat one
part, $B$ say, of a complex system $A+B$ as a complete system by
treating all the quantities referring to this part as matrices with
elements defined with respect to the different stationary states of $A$ taken
alone. This result has been proved by using for the function $\Omega$ describing
the whole system just the expansion (424) with the important restriction
that the functions $\chi$ should be exact solutions of the equation



$$H_A \chi = H_A' \chi.$$

This treatment can be conveniently applied to the present problem
only when there is no interaction between the particles of $A$ and when
the individual functions $\psi_r(x)$ are exact solutions of the equation
$E\psi_r(x) = E_r\psi_r(x)$. In this case the symmetrical or antisymmetrical
functions $\chi_n(X)$ will also be exact solutions of the equation

$$H_A \chi_n = H_A' \chi_n,$$

where $H_A = \sum_{i=1}^{N} E_i$, and the theory of § 39 will be wholly
applicable to our problem.
This application is derived directly from the equation (425) if we put
$F = 0$ and $E_{rs} = \delta_{rs} E_r$. Denoting further the sum
$\sum_r E_r C_r^\dagger C_r = \sum_r E_r n_r$ by $W_n$ and putting

$$\omega_n = \omega_n'\, e^{-i2\pi W_n t/h}, \tag{425 a}$$

we get

$$(H_B + M)\,\omega' = -\frac{h}{2\pi i}\frac{\partial \omega'}{\partial t}. \tag{425 b}$$

This equation coincides with the equation (329 a) of § 39 if the operator
of the interaction energy $M$ is defined as
$\sum_r \sum_s V_{rs}(Y)\, C_r^\dagger C_s$. As a matter of fact, the result of
its application to the function $\omega_n$ can be written in the form
$\sum_{n'} M_{nn'}\, \omega_{n'}$, where $n$ and $n'$ denote two sequences of
the partition numbers $n_1, n_2, \ldots, n_r, \ldots$ and $n_1', n_2', \ldots, n_r', \ldots$,
differing from each other (as in the previously considered case) by one of
the numbers in the second sequence being greater and another less by 1 than
the corresponding numbers of the first sequence. In other words, the
matrix components of the interaction energy $M$ appearing in (425 b)
are taken with respect to collective states of the ‘ignored’ part $A$ of the
system $A+B$ which differ by just one particle jumping from one
individual state to another, or, in other words, by a one-quantum jump in
opposite directions of two of the quantized partition numbers $n_1, n_2, \ldots$
which specify the states of $A$.

The system B can in its turn consist of a number of identical particles 
of a different kind from those constituting the system A (for instance, 
$A$ may be a system of photons or protons and $B$ a system of
electrons). In this case it is possible to apply the method of intensity
quantization to the two systems simultaneously, by defining the
functions $\omega_n(Y, t)$ in (424) as symmetrical or antisymmetrical combinations
of certain orthogonal and normalized functions
$\phi_1(y), \phi_2(y), \ldots, \phi_s(y), \ldots$
describing a sequence of stationary states of the separate particles of
$B$. We can then take the equations (425 b) as our starting-point and
transform them by putting

$$\omega_n(Y, t) = \sum_m C_{nm}(t)\, \omega_m(Y),$$

where $\omega_m(Y)$ depends (symmetrically or antisymmetrically) on the
coordinates $Y$ only. We can also (and this is perhaps a more natural
procedure) carry out the two quantization processes simultaneously,
starting from the original equation
$-\frac{h}{2\pi i}\frac{\partial \Omega}{\partial t} = H\Omega$ and putting

$$\Omega(X, Y, t) = \sum_n \sum_m C_{nm}(t)\, \omega_m(Y)\, \chi_n(X). \tag{426}$$

We thus obtain an equation of the following form:

$$(L + K + M)\, C = -\frac{h}{2\pi i}\frac{\partial C}{\partial t}, \tag{426 a}$$


where L and K are the quantized energy operators referring to the 
two systems A and B taken separately, while M is the operator of their 
interaction energy. If $A$ is antisymmetrical and $B$ symmetrical, we can
use for $L$ and $K$ the expressions (402) and (417 b) respectively (affixing
the indices $x$ and $y$ to the operators $E$ and $F$ in order to distinguish
the particles of the two sorts), whereas the operator $M$ is expressed in
this case by the formula

$$M = \sum_r \sum_{r'} \sum_s \sum_{s'} v_{rs;r's'}\, a_r^\dagger a_{r'}\, b_s^\dagger b_{s'}, \tag{426 b}$$

where $v(x, y)$ is the interaction energy between one particle of the sort
$A$ and one particle of the sort $B$, and

$$v_{rs;r's'} = \iint \psi_r^*(x)\, \phi_s^*(y)\, v(x, y)\, \psi_{r'}(x)\, \phi_{s'}(y)\, dx\, dy.$$

In the equations (426 a) $C = C_{mn}(t)$ is to be considered as a wave
function whose arguments are the partition numbers $m_s$ and $n_r$, or
rather the corresponding operators, defined as $b_s^\dagger b_s$ and
$a_r^\dagger a_r$ respectively. These results can be generalized further for
the case of three or more systems of identical particles, for instance
electrons, protons, and photons, interacting with each other.

We are now going to consider more closely the particular case of the 
photons, i.e. light waves, in interaction with an ordinary material 
system, which for the sake of simplicity we shall suppose to consist 
of a single electron, forming with the fixed source of the external 
field in which it moves a hydrogen-like atom. The peculiarity of this 
problem lies in the fact that photons cannot actually be treated as 
ordinary particles. As has been emphasized in Part I (§ 24) the analogy 
between light and matter has a very limited scope, and the notion of 
photons must be considered as a useful fiction of the same sort as that 
of ‘phonons’ (sound-quanta). In applying this fiction to the interaction 
between light and matter we must remember in the first place the fact 
that the number of photons does not remain constant, photons being 
created in the act of emission and destroyed in the act of absorption. 
This fact excludes the possibility of describing a system of photons by 
the method of configuration space. Under such conditions a strict 
application of the intensity quantization scheme devised for ordinary 
particles to the case of photons is impossible. It is nevertheless possible 
to apply the final results to this rather fictitious case, thanks to the 
fact that we do not have to introduce any interaction between the 
photons. We must, however, suitably define the expression for the 
mutual energy between the photons on the one hand, and the material 
system (electron, atom) on the other, in terms of the partition numbers 
which describe the distribution of the photons over the different states, 
and, moreover, provide in a physically irrelevant way for a formal 
conservation of the total number of photons. 

This latter circumstance can easily be achieved by introducing an
additional state of zero energy corresponding by definition to an actual
absence of the photons. Emission or absorption of a photon will be
interpreted under this condition as the transition of a photon from or
into the zero state.

The total energy of the photons taken alone (if this part of the 
system is referred to as B) can thus be represented by the operator 


$$H_B = \sum_r E_r n_r = \sum_r h\nu_r\, b_r^\dagger b_r, \tag{427}$$

where $\nu_0 = 0$. The operators $b$, $b^\dagger$ are introduced here not on the
ground that a system of photons is describable in the configuration space by
a symmetrical wave function, but because we know that the photons
conform to the statistics of Einstein and Bose, i.e. behave like material
particles of the ‘symmetrical’ type. It should further be remarked that
the quantities $E_r = h\nu_r$ are introduced here not by the general formula
$E_r = \int \phi_r^* E_y \phi_r\, dy$ (since neither the operator $E_y$ nor the
wave functions $\phi_r(y)$ have a meaning for photons) but by way of definition.

The part of the energy corresponding to the atom alone can be defined
in the usual way. It thus remains to define suitably the interaction
energy $M = \sum_i V(X, y_i)$, or rather the matrix elements $V_{rr'}(X)$, the
function $V(X, y_i)$ being itself just as meaningless as the operator $E_y$.

In looking for such a definition we can be guided by the classical 
expression for the energy of an atom or electron in the electromagnetic 
field of the light waves. This field, according to classical electro- 
dynamics, is fully determined by its vector potential A as a function 
of the coordinates and the time, while the scalar potential ¢ can without 
any loss of generality be set equal to zero. The electric and magnetic 
intensity can be calculated with the help of A by means of the formulae 


$$\mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A}.$$

Now the energy of an electron in an additional field specified by the
vector potential $\mathbf{A}$ is equal, if terms quadratic in $\mathbf{A}$ are
neglected, to

$$-\frac{e}{c}\,\mathbf{A}\cdot\mathbf{v} = -\frac{e}{mc}\,\mathbf{A}\cdot\mathbf{p},$$

where $\mathbf{p}$ is the electron’s momentum. This formula can be taken over
into the wave mechanics if $\mathbf{p}$ is defined as the operator
$\frac{h}{2\pi i}\nabla$. In order to be able to treat this expression as the
mutual energy of the electron and of the photons, it remains to split up
$\mathbf{A}$ into separate parts, $\mathbf{A}_i$ say, which may be assumed to
correspond to the separate photons, and to find the matrix elements of
$\mathbf{A}_i$ with respect to the different ‘states’ $r$ and $r'$ of the
photons. Putting, for the sake of brevity,

$$-\frac{e}{mc}\,(\mathbf{A}_i)_{rr'} = \mathbf{P}_{rr'},$$

where $\mathbf{P}_{rr'}$ is obviously independent of the individuality of the
photon (specified by the index $i$), we thus get for the energy of the
electron with respect to the light waves the expression

$$M = \mathbf{p}\cdot\sum_r \sum_{r'} \mathbf{P}_{rr'}\, b_r^\dagger b_{r'}, \tag{427 a}$$

which can be interpreted as the mutual energy of the electron and the
photons. The problem is thus reduced to the determination of the
matrix elements.

The simplest way to determine them is based on the assumption that 
the perturbation energy (427 a) must be responsible for such acts as the 
emission or absorption of light only. This means that the non-vanishing 
elements $\mathbf{P}_{rr'}$ must correspond either to $r = 0$ or $r' = 0$. Since
the number of photons in the zero state can be assumed to be infinite
(i.e. actually indeterminate) the operators $b_0 = a_0\sqrt{n_0}$ and
$b_0^\dagger = \sqrt{n_0}\, a_0^\dagger$ must also have infinite characteristic
values, so that the matrix elements $\mathbf{P}_{0r}$ and $\mathbf{P}_{r0}$ must
be infinitely small. All we need, however, is their products with
$b_0^\dagger$ and $b_0$. Denoting these products by $\mathbf{v}_r^\dagger$ and
$\mathbf{v}_r$ respectively, we can reduce (427 a) to the form

$$M = \mathbf{p}\cdot\sum_r \big(\mathbf{v}_r^\dagger b_r + \mathbf{v}_r b_r^\dagger\big). \tag{427 b}$$

The operator $\mathbf{p}\cdot\mathbf{v}_r^\dagger b_r$ determines the probability
of emission and the operator $\mathbf{p}\cdot\mathbf{v}_r b_r^\dagger$ the
probability of absorption of a photon $h\nu_r$. Our problem would be
completely solved if we knew the dependence of $\mathbf{v}_r$ and
$\mathbf{v}_r^\dagger$ on $h\nu_r$. This dependence can be found by comparing the
quantum interaction operator (427 b) with the classical one


$$-\frac{e}{mc}\,\mathbf{A}_r\cdot\mathbf{p},$$

where $\mathbf{A}_r$ is the harmonic component in the Fourier analysis of
$\mathbf{A}$ with the frequency $\nu_r$. The energy per unit volume
corresponding to this component is equal to $(E_r^0)^2/8\pi$, where $E_r^0$ is
the amplitude of the electric intensity (since in the case of light waves
the amplitudes of the electric and magnetic vectors are numerically equal).
Now according to the relation

$$\mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}$$

we have

$$E_r^0 = \frac{2\pi\nu_r}{c}\, A_r^0.$$

The energy corresponding to a given harmonic vibration in the whole
volume $V$ of the enclosure where they take place is thus equal to
$\frac{\pi\nu_r^2}{2c^2}(A_r^0)^2 V$. On the other hand, this energy must be equal
to the product of $h\nu_r$ with the number of photons associated with the
vibrations under consideration. We have therefore

$$\frac{\pi\nu_r^2}{2c^2}\,(A_r^0)^2\, V = h\nu_r\, n_r,$$

whence

$$A_r^0 = c\,\sqrt{\frac{2 h n_r}{\pi \nu_r V}}. \tag{428}$$


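This expression, multiplied by the phase factor
$\cos(2\pi\nu_r t + \gamma_r) = \cos\phi_r$, must obviously correspond to the
quantum expression

$$\mathbf{v}_r^\dagger b_r + \mathbf{v}_r b_r^\dagger,$$

which can be written in a similar way if we assume that
$\mathbf{v}_r = \mathbf{v}_r^\dagger$ and if further the operators
$a_r = e^{i2\pi\theta_r/h}$ and $a_r^\dagger = e^{-i2\pi\theta_r/h}$ are identified
with the complex phase factors $e^{i\phi_r}$ and $e^{-i\phi_r}$. In the limiting
case of very high characteristic values of $n_r$ we can treat $\sqrt{n_r}$ and
$a_r$ as commuting (neglecting 1 compared with $n_r$) and write accordingly

$$\mathbf{v}_r^\dagger b_r + \mathbf{v}_r b_r^\dagger = \mathbf{v}_r (b_r + b_r^\dagger)
  = \mathbf{v}_r \sqrt{n_r}\,\big(e^{i2\pi\theta_r/h} + e^{-i2\pi\theta_r/h}\big) \approx 2\mathbf{v}_r \sqrt{n_r}\cos\phi_r. \tag{428 a}$$

Hence it follows that

$$-\frac{e}{mc}\, A_r^0 = 2 v_r \sqrt{n_r},$$

which is identical with (428) if we put

$$v_r = -\frac{e}{m}\sqrt{\frac{h}{2\pi V \nu_r}}. \tag{428 b}$$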
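
The chain leading from the classical amplitude to the coupling vector can be followed numerically. The sketch below is only an illustration: the physical constants are rounded CGS values, the frequency, photon number and volume are arbitrary, and only the magnitude of $v_r$ is computed, namely $(e/m)\sqrt{h/2\pi V\nu_r}$.

```python
import math

H = 6.626e-27         # Planck's constant h, erg s (rounded)
C = 2.998e10          # velocity of light, cm/s (rounded)
E_CHARGE = 4.803e-10  # electronic charge, esu (rounded)
M_E = 9.109e-28       # electronic mass, g (rounded)


def amplitude(nu, n, volume):
    """Amplitude of the vector potential: c * sqrt(2 h n / (pi nu V))."""
    return C * math.sqrt(2 * H * n / (math.pi * nu * volume))


def v_magnitude(nu, volume):
    """Magnitude of the coupling vector: (e/m) * sqrt(h / (2 pi V nu))."""
    return (E_CHARGE / M_E) * math.sqrt(H / (2 * math.pi * volume * nu))


nu, n, volume = 5.0e14, 7, 1.0            # arbitrary illustrative values
a0 = amplitude(nu, n, volume)
e0 = 2 * math.pi * nu * a0 / C            # amplitude of the electric intensity
energy = e0 ** 2 / (8 * math.pi) * volume # energy of the vibration in the volume
lhs = (E_CHARGE / (M_E * C)) * a0         # (e / mc) * A0
rhs = 2 * v_magnitude(nu, volume) * math.sqrt(n)
```

With these definitions `energy` agrees with `H * nu * n`, and `lhs` agrees with `rhs`, so that the amplitude fixed by the photon number and the coupling magnitude are mutually consistent.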
The direction of the vector $\mathbf{v}_r$ coincides with that of
$\mathbf{A}_r$, i.e. with the direction of the electrical vibrations. The
wave equation which determines the motion and interaction of the atom
(electron) with the photons can be written accordingly in the form

$$-\frac{h}{2\pi i}\frac{\partial \omega}{\partial t} = \Big[H_A + \sum_r h\nu_r\, b_r^\dagger b_r + \sum_r \mathbf{p}\cdot\mathbf{v}_r\,(b_r + b_r^\dagger)\Big]\omega, \tag{429}$$

which can be obtained from the general equation (425) if we put $F = 0$,
interchange $x$ with $y$, and determine, as shown above, the interaction
energy matrix $V_{rr'}$. Substituting in (429)
$\omega = \omega' e^{-i2\pi W_n t/h}$, where $W_n = \sum_r h\nu_r n_r$, we can
reduce the preceding equation to the form

$$-\frac{h}{2\pi i}\frac{\partial \omega'}{\partial t} = \Big[H_A + \sum_r \mathbf{p}\cdot\mathbf{v}_r\,(b_r + b_r^\dagger)\Big]\omega', \tag{429 a}$$

which is a special case of (425 b).
Regarding $M = \sum_r \mathbf{p}\cdot\mathbf{v}_r\,(b_r + b_r^\dagger)$ as the
operator of the perturbation energy causing transitions between the
stationary states of the atom (electron) with emission or absorption of
radiation, we can determine the probability of such transitions by
calculating the corresponding matrix elements of $M$. Now these matrix
elements can be written in the form

$$(J, n|M|J', n') = \sum_r (\mathbf{p}\cdot\mathbf{v}_r)_{JJ'}\,(b_r + b_r^\dagger)_{nn'}.$$

By the definition of the operators $b$ we have

$$b_r\,\omega = a_r \sqrt{n_r}\;\omega(n_r) = \sqrt{n_r + 1}\;\omega(n_r + 1),$$

whence it follows, in view of the orthogonality and normalization of
the functions $\omega$, that the matrix element $(b_r)_{nn'}$ is different from
zero only if $n_r' = n_r + 1$, all the other numbers of the two sequences
$n_s$ and $n_s'$ (apart from $n_r$) being the same. The value of this matrix
element is equal in this case to $\sqrt{n_r + 1}$. For the matrix element
$(b_r^\dagger)_{nn'}$ we find likewise a non-vanishing value, namely,
$\sqrt{n_r}$ if $n_r' = n_r - 1$, all the other numbers of the two sequences
being the same [cf. eq. (423 a), § 47].

We thus see that the probability of the emission of a quantum of
frequency $\nu_r$ is proportional to

$$|(\mathbf{p}\cdot\mathbf{v}_r)_{JJ'}|^2\,(n_r + 1), \tag{429 b}$$

while that of its absorption is proportional to

$$|(\mathbf{p}\cdot\mathbf{v}_r)_{JJ'}|^2\, n_r, \tag{429 c}$$

the proportionality coefficient being, of course, the same in both cases.
The energies of the two states of the atom $J$ and $J'$ must differ from
each other by an amount approximately equal to $\pm h\nu_r$. The fact that
the absorption probability is proportional to the number of photons in
the initial state, i.e. to the energy of the latter, is quite natural. It is,
however, very remarkable to find that the emission probability is
proportional to the number of photons not in the initial, but in the final
state, being thus different from zero even if $n_r = 0$, i.e. if no photons
of the given sort were present at the beginning (except in the zero
state). This result gives an interpretation of the spontaneous emission
of light as stimulated by a photon which was initially in the zero state.
The sum $n_r + 1$ in (429 b) can be interpreted accordingly as the
expression of the fact that the emission of light takes place in two ways,
namely, as a result of the stimulative action of the light already present,
the probability of this induced emission being exactly equal to the
probability of the absorption, and also spontaneously. The ratio $n_r : 1$
must therefore be equal to the ratio $B\rho/A$ of the probability of absorption
or induced emission to the probability of spontaneous emission, $A$ and
$B$ being the well-known Einstein coefficients (see Part I, §§ 17 and 18)
and $\rho$ the density of the energy per unit volume and per unit frequency
range.
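
The factors $n_r + 1$ and $n_r$ follow directly from the squares of the matrix elements of $b_r$ and $b_r^\dagger$. A minimal sketch, in which the common atomic factor is represented by a hypothetical parameter `coupling_sq` set to 1:

```python
import math


def b_element(n, n_prime):
    """Matrix element (b)_{n,n'}: non-zero only for n' = n + 1."""
    return math.sqrt(n + 1) if n_prime == n + 1 else 0.0


def b_dag_element(n, n_prime):
    """Matrix element (b^dagger)_{n,n'}: non-zero only for n' = n - 1."""
    return math.sqrt(n) if n_prime == n - 1 else 0.0


def rates(n, coupling_sq=1.0):
    """Relative emission and absorption probabilities for n photons present."""
    emission = coupling_sq * b_element(n, n + 1) ** 2        # proportional to n + 1
    absorption = coupling_sq * b_dag_element(n, n - 1) ** 2  # proportional to n
    return emission, absorption


spontaneous_only = rates(0)   # no photons of the given sort present
with_three = rates(3)         # three photons present
```

For `n = 0` the absorption vanishes while the emission remains equal to the coupling factor, which is the spontaneous part; for `n = 3` the two values stand in the ratio 4 : 3, the excess of emission over absorption being again the spontaneous part.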
This result can easily be verified. We have, in fact,

$$\rho\, d\nu = \frac{1}{V}\sum_{d\nu} n_r\, h\nu_r,$$

where the summation is extended over all the frequencies within the
given range. Now, as has been shown in Part I, §§ 11 and 37, the number
$dz$ of free oscillations of any kind in an enclosure with a volume $V$,
whose wave number lies in the range $dk$, is equal to $4\pi V k^2\, dk$. Applying
this to light oscillations (with a given state of polarization) we get,
since $k = \nu/c$,

$$dz = \frac{4\pi V}{c^3}\,\nu^2\, d\nu.$$

If $n_r$ is considered as a practically continuous function of the frequency,
it can be assumed to have the same value for all oscillations within the
small range $d\nu$. We then get

$$\rho\, d\nu = \frac{1}{V}\, n_r\, h\nu\, dz = \frac{4\pi}{c^3}\, n_r\, h\nu^3\, d\nu,$$

whence

$$\frac{\rho}{n_r} = \frac{4\pi}{c^3}\, h\nu^3,$$

which actually coincides with the ratio $A/B$ found in Part I, eq. (103 a).
We thus see that the theory of the emission and absorption of radiation
developed in this section (and due to Dirac) has the advantage of
interpreting the spontaneous transitions with emission of radiation, actually
combining such spontaneous emissions with the induced ones.
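
The counting argument can be repeated numerically. The following sketch assumes rounded CGS constants and arbitrary values of the volume, the frequency and the width of the range; it shows only that the volume and the width drop out of the ratio $\rho/n_r$.

```python
import math

H = 6.626e-27   # Planck's constant h, erg s (rounded)
C = 2.998e10    # velocity of light, cm/s (rounded)


def modes_in_range(volume, nu, dnu):
    """Number dz of oscillations of one polarization: 4 pi V nu^2 dnu / c^3."""
    return 4 * math.pi * volume * nu ** 2 * dnu / C ** 3


def energy_density(n, volume, nu, dnu):
    """rho defined by rho * dnu = (1/V) * n * h * nu * dz."""
    return n * H * nu * modes_in_range(volume, nu, dnu) / (volume * dnu)


nu = 6.0e14                                     # arbitrary frequency
ratio = energy_density(5, 2.0, nu, 1.0e9) / 5   # rho / n for n = 5
ratio_other_box = energy_density(5, 40.0, nu, 3.0e7) / 5
expected = 4 * math.pi * H * nu ** 3 / C ** 3
```

Both `ratio` and `ratio_other_box` agree with `expected`, that is with $4\pi h\nu^3/c^3$, whatever enclosure and frequency range are chosen.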
It is easy to obtain from the above theory the absolute values of the
emission and absorption probabilities. To do this we must multiply
the expressions (429 b) and (429 c) by $\pi^2/h^2$ and further by

$$\frac{dz}{d\nu} = \frac{4\pi V}{c^3}\,\nu^2,$$

so long as we are interested in the emission or absorption not of a
particular photon with the frequency $\nu_r$ and a given direction of motion,
but of any photon with a frequency lying within a narrow range $\Delta\nu$
irrespective of the direction of motion. In view of the unsharp character
of the resonance, summation of all the transition probabilities within
the range $\Delta\nu$ leads to a result which is independent of the actual
magnitude of $\Delta\nu$.

The resulting probability of a ‘spontaneous’ emission, for example,
per unit time and unit frequency range thus turns out to be

$$A = \frac{\pi^2}{h^2}\,|(\mathbf{p}\cdot\mathbf{v}_r)_{JJ'}|^2\,\frac{4\pi V \nu_r^2}{c^3}.$$

Substituting here the expression (428 b) for $\mathbf{v}_r$ and denoting the
component of $\mathbf{p}$ in the direction of the vector $\mathbf{v}_r$ (i.e. the
direction of the electrical oscillations) by $p_\varepsilon$, we get

$$A = \frac{2\pi^2 e^2 \nu_r}{h m^2 c^3}\,|(p_\varepsilon)_{JJ'}|^2,$$

or, if $p_\varepsilon$ is replaced by
$m\,\frac{dx_\varepsilon}{dt} = m\, 2\pi\nu_r\, i\, x_\varepsilon$,

$$A = \frac{8\pi^4 e^2 \nu_r^3}{h c^3}\,|(x_\varepsilon)_{JJ'}|^2,$$

which coincides with the formula (93) of Part I if we take account
of a definitely polarized radiation only.
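
The cancellation of the volume $V$ in the spontaneous emission probability can be made explicit numerically. The sketch below rests on several assumptions: the prefactor is taken as $\pi^2/h^2$ and the density of oscillations as $4\pi V\nu^2/c^3$, the coupling magnitude as $(e/m)\sqrt{h/2\pi V\nu}$, the constants are rounded CGS values, and the matrix element of the coordinate is an arbitrary number of atomic size.

```python
import math

H = 6.626e-27         # Planck's constant h, erg s (rounded)
C = 2.998e10          # velocity of light, cm/s (rounded)
E_CHARGE = 4.803e-10  # electronic charge, esu (rounded)
M_E = 9.109e-28       # electronic mass, g (rounded)


def rate_from_momentum(p_eps, nu, volume):
    """(pi^2 / h^2) * |p v|^2 * 4 pi V nu^2 / c^3 with the assumed coupling v."""
    v = (E_CHARGE / M_E) * math.sqrt(H / (2 * math.pi * volume * nu))
    mode_density = 4 * math.pi * volume * nu ** 2 / C ** 3
    return (math.pi ** 2 / H ** 2) * (p_eps * v) ** 2 * mode_density


def rate_from_coordinate(x_eps, nu):
    """8 pi^4 e^2 nu^3 |x|^2 / (h c^3)."""
    return 8 * math.pi ** 4 * E_CHARGE ** 2 * nu ** 3 * x_eps ** 2 / (H * C ** 3)


nu = 6.0e14                       # arbitrary frequency
x_eps = 0.5e-8                    # hypothetical matrix element of the coordinate, cm
p_eps = 2 * math.pi * M_E * nu * x_eps   # modulus of m * 2 pi nu * i * x
rate_small_box = rate_from_momentum(p_eps, nu, 1.0)
rate_large_box = rate_from_momentum(p_eps, nu, 1.0e6)
rate_direct = rate_from_coordinate(x_eps, nu)
```

The three values coincide, so that the result does not depend on the size of the enclosure and reduces to the expression in terms of the coordinate.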

In order to account not only for the emission and absorption but
also for the scattering of radiation, we must consider the hitherto
neglected term of the perturbation energy, which is proportional to
the square of the vector potential $\mathbf{A}$.

Subtracting from the operator
$\frac{1}{2m}\big(\mathbf{p} - \frac{e}{c}\mathbf{A}\big)^2$ the operator $p^2/2m$
which corresponds to $\mathbf{A} = 0$, we find for $M$, the operator of the
mutual energy between the electron and the light, the expression

$$M = -\frac{e}{mc}\,\mathbf{A}\cdot\mathbf{p} + \frac{e^2}{2mc^2}\,A^2, \tag{430}$$

differing from the previous one by the extra term $e^2A^2/2mc^2$.

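In order to find its quantum interpretation let us put
$\mathbf{A} = \sum_r \mathbf{A}_r$, where $\mathbf{A}_r = \mathbf{A}_r^0\cos\phi_r$
denotes a harmonic component of $\mathbf{A}$. This gives

$$A^2 = \sum_r \sum_s \mathbf{A}_r\cdot\mathbf{A}_s = \sum_r \sum_s \mathbf{A}_r^0\cdot\mathbf{A}_s^0 \cos\phi_r \cos\phi_s,$$

and consequently, according to (428 a),

$$M_2 = \frac{e^2}{2mc^2}\,A^2 = 2m \sum_r \sum_s \mathbf{v}_r\cdot\mathbf{v}_s \sqrt{n_r}\sqrt{n_s}\,\cos\phi_r \cos\phi_s
  = \tfrac{1}{2}m \sum_r \sum_s \mathbf{v}_r\cdot\mathbf{v}_s \sqrt{n_r}\sqrt{n_s}\,\big(e^{i\phi_r} + e^{-i\phi_r}\big)\big(e^{i\phi_s} + e^{-i\phi_s}\big),$$

which in view of the correspondence between the complex phase factors
$e^{i\phi}$, $e^{-i\phi}$ and the operators $a = e^{i2\pi\theta/h}$,
$a^\dagger = e^{-i2\pi\theta/h}$ can be considered as the approximate form of
the operator

$$M_2 = \tfrac{1}{2}m \sum_r \sum_s \mathbf{v}_r\cdot\mathbf{v}_s\,\big(b_r^\dagger b_s + b_s^\dagger b_r\big) = m \sum_r \sum_s \mathbf{v}_r\cdot\mathbf{v}_s\, b_r^\dagger b_s$$

if we leave aside extra terms of the type
$\tfrac{1}{2}m \sum_r \sum_s \mathbf{v}_r\cdot\mathbf{v}_s\,(b_r b_s + b_r^\dagger b_s^\dagger)$
which correspond to a double emission or a double absorption and which do
not seem to have any real physical significance. Substituting here the
expression (428 b) for $\mathbf{v}$ and denoting by $\theta_{rs}$ the angle between
the directions of the electric vibrations of the types $r$ and $s$, we get

$$M_2 = \frac{e^2 h}{2\pi m V} \sum_r \sum_s \frac{\cos\theta_{rs}}{\sqrt{\nu_r \nu_s}}\; b_r^\dagger b_s. \tag{430 a}$$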

This operator, considered as a perturbation energy, determines the
probability of those transitions, in which one photon ($h\nu_r$) is absorbed
and another ($h\nu_s$) is emitted. Since the state of the atom must not
change [this follows from the fact that its coordinates do not explicitly
appear in (430 a)], i.e. its energy must remain the same, the two
frequencies $\nu_r$ and $\nu_s$ of the absorbed and emitted light must likewise be
the same; we thus have to do with a change of its direction only. This
is the normal coherent scattering. As has been pointed out in § 23, the
scattered light can in reality be different from the incident one (as in
the Raman or Compton effect). The above theory cannot be extended
to such cases of combined scattering.†

† As a matter of fact, it is not strictly applicable even to simple
scattering: if instead of the Schrödinger equation containing terms
quadratic in the potential $\mathbf{A}$, we used Dirac’s equation which is
linear in $\mathbf{A}$, we should obtain to the first approximation
(corresponding to simple transitions) no scattering at all.


49. Electromagnetic Waves with Quantized Amplitudes; Theory
of Spontaneous Transitions and of Radiation Damping


The preceding theory (due to Dirac) can be greatly simplified if,
following Jordan, Pauli, and especially Heisenberg, we do not explicitly
introduce the notion of photons but treat the phenomena of light from
the point of view of the wave theory, replacing, however, the classical
electromagnetic waves by waves (oscillations) with quantized amplitudes.
Let $\phi(x, y, z, t)$ denote a plane harmonic wave of some quantity $\phi$
characteristic of the electromagnetic field: electric or magnetic
field-strength, scalar or vector potential, etc. It may be a wave travelling
in a definite direction or a standing wave formed by the superposition of
two waves of the same frequency and amplitude, travelling in opposite
directions. In the former case we can put

$$\phi(x, y, z, t) = C_k\, e^{i2\pi(\mathbf{k}\cdot\mathbf{r} - \nu t)} + C_k^*\, e^{-i2\pi(\mathbf{k}\cdot\mathbf{r} - \nu t)}, \tag{431}$$

where $\mathbf{r}$ is the vector with the components $x$, $y$, $z$ and
$\mathbf{k}$ the wave vector; the magnitude of the latter is connected with the
frequency by the relation $k = \nu/c$, $c$ being the velocity of light. The two
amplitudes $C_k$ and $C_k^*$ must be conjugate complex quantities so that
$\phi$ may be real. The expression (431) can be rewritten accordingly in the
form

$$\phi(x, y, z, t) = A_k \cos 2\pi(\mathbf{k}\cdot\mathbf{r} - \nu t) + B_k \sin 2\pi(\mathbf{k}\cdot\mathbf{r} - \nu t), \tag{431 a}$$
where $A_k$ and $B_k$ are two real coefficients. Taking the sum of the
expressions (431) or (431 a) for various magnitudes and directions of
the vector $\mathbf{k}$ (forming a discrete or a continuous sequence) with
suitably chosen complex amplitude coefficients $C_k$ (or $A_k$, $B_k$), we can
represent the value of the quantity $\phi$ as a function of the space
coordinates and of the time for any electromagnetic field in ‘empty space’,
i.e. satisfying d’Alembert’s equation

$$\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0. \tag{432}$$

It should be kept in mind, however, that this representation does not
hold for an electromagnetic field produced by electric charges situated
within the region under consideration, since such a field is determined
by a non-homogeneous equation of the form

$$\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\rho, \tag{432 a}$$

$\rho$ being the volume density of the charges if $\phi$ is the scalar potential,
or the electric current density if $\phi$ is the vector potential.

So long, however, as we are dealing with radiation, we may safely 
assume equation (432) to hold, and accordingly represent its general 
solution in the form of a sum (or integral) of the expressions (431). 
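
That a harmonic wave with $k = \nu/c$ satisfies the homogeneous equation can be checked by finite differences. The sketch below restricts the wave to one space dimension; the amplitudes, the frequency, the point of evaluation and the step size are arbitrary assumptions of the example.

```python
import math

C = 2.998e10  # velocity of light, cm/s (rounded)


def wave(x, t, nu, a_k=1.0, b_k=0.5):
    """A_k cos 2 pi (k x - nu t) + B_k sin 2 pi (k x - nu t), with k = nu / c."""
    k = nu / C
    phase = 2 * math.pi * (k * x - nu * t)
    return a_k * math.cos(phase) + b_k * math.sin(phase)


def dalembert_residual(x, t, nu, step=1e-3):
    """Second x-derivative minus (1/c^2) second t-derivative, by central differences."""
    wavelength, period = C / nu, 1.0 / nu
    dx, dt = step * wavelength, step * period
    d2x = (wave(x + dx, t, nu) - 2 * wave(x, t, nu) + wave(x - dx, t, nu)) / dx ** 2
    d2t = (wave(x, t + dt, nu) - 2 * wave(x, t, nu) + wave(x, t - dt, nu)) / dt ** 2
    return d2x - d2t / C ** 2


nu = 5.0e14
scale = (2 * math.pi * nu / C) ** 2   # natural size of each second derivative
relative_residual = abs(dalembert_residual(0.3 * C / nu, 0.2 / nu, nu)) / scale
```

The quantity `relative_residual` is very small compared with unity, the two second derivatives cancelling each other up to rounding errors.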

The transition from the classical electromagnetic theory of light to the
quantum theory can be achieved in the simplest way (without
introducing the notion of light quanta) by regarding the amplitude
coefficients $C_k$, $C_k^*$ not as ordinary complex numbers but as non-commuting
quantum operators proportional to the operators $b$, $b^\dagger$ which have been
used before with conjugate complex proportionality coefficients
$\gamma_k$, $\gamma_k^*$ which are determined by the normalization condition for
the function $\phi_k$. Adding to $k$ the further suffix $\varepsilon$ to indicate
the polarization ($\varepsilon = 1, 2$), we obtain the following quantum
expression for a plane polarized harmonic wave of light,

$$\phi_{k,\varepsilon}(x, y, z, t) = \gamma_{k,\varepsilon}\, b_{k,\varepsilon}\, e^{i2\pi(\mathbf{k}\cdot\mathbf{r} - \nu t)} + \gamma_{k,\varepsilon}^*\, b_{k,\varepsilon}^\dagger\, e^{-i2\pi(\mathbf{k}\cdot\mathbf{r} - \nu t)}. \tag{433}$$


486 SECOND QUANTIZATION § 49 
The substitution of the operators $b$, $b^\dagger$ for the coefficients $C$, $C^*$ secures
the ‘quantization’ of all those quantities which are expressed as volume
integrals of the square of $\varphi$ (extended over the whole region in which
$\varphi$ is different from zero).

Thus, for instance, taking the square of (433) and integrating over
a volume $V$ outside which $\varphi$ can be assumed to vanish, we get
$$\int_V\varphi^2_{\mathbf k\xi}\,dV=|\gamma_{\mathbf k\xi}|^2\,V\,(b_{\mathbf k\xi}b^\dagger_{\mathbf k\xi}+b^\dagger_{\mathbf k\xi}b_{\mathbf k\xi}), \tag{433a}$$


the squares of the two terms of (433) giving no contribution to the
integral on account of the periodic factors $e^{\pm i4\pi\mathbf k\cdot\mathbf r}$.
Now by the definition of the operators $b$, $b^\dagger$ we have
$$b^\dagger b=N,\qquad bb^\dagger=N+1,$$
where $N$ is an integer or, more exactly, an operator capable of assuming
integral positive values only. Affixing to it the suffixes $\mathbf k$, $\xi$ which
specify the oscillations under consideration, we thus get
$$\int_V\varphi^2_{\mathbf k\xi}\,dV=2|\gamma_{\mathbf k\xi}|^2\,V\,(N_{\mathbf k\xi}+\tfrac12). \tag{433b}$$


If $\varphi_{\mathbf k\xi}$ is identified with the electric intensity $E$, the expression (433b)
divided by $4\pi$ can be interpreted as the electromagnetic energy $W_{\mathbf k\xi}$
enclosed in the volume $V$ (since the magnetic part of the energy is equal
to the electric one). Putting
$$\gamma_{\mathbf k\xi}=\sqrt{\frac{2\pi h\nu}{V}}\qquad(\nu=ck), \tag{433c}$$
we obtain for this energy the expression
$$W_{\mathbf k\xi}=(N_{\mathbf k\xi}+\tfrac12)\,h\nu,$$
which differs from that of the photon theory by the presence of the
term $\tfrac12$ in the brackets ($N$ being the number of photons).
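
The relations $b^\dagger b = N$, $bb^\dagger = N+1$ and the resulting half-integral energy levels can be illustrated with finite matrices. The following is only a sketch: it assumes a truncated basis of `dim` number states (the true operators act on an unlimited ladder), and the helper names `lowering`, `matmul`, etc. are illustrative choices, not notation from the text.

```python
# Truncated matrix model of the operators b and b† (basis: N = 0 .. dim-1).
# Illustrative only: the true operators act on an infinite ladder of states.
from math import sqrt

def lowering(dim):
    """Matrix of b, with <N-1| b |N> = sqrt(N)."""
    b = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        b[n - 1][n] = sqrt(n)
    return b

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, c):
    return [[sum(a[i][k] * c[k][j] for k in range(len(c)))
             for j in range(len(c[0]))] for i in range(len(a))]

def diagonal(a):
    return [a[i][i] for i in range(len(a))]

dim = 6
b = lowering(dim)
bd = transpose(b)                    # b† (entries are real, so transpose suffices)

n_op = diagonal(matmul(bd, b))       # b†b -> N
n_plus_1 = diagonal(matmul(b, bd))   # bb† -> N + 1 (last entry spoiled by truncation)

# Symmetrised product entering (433b), expressed in units of h*nu;
# the last level is discarded because of the truncation.
energy_levels = [0.5 * (x + y) for x, y in zip(n_op, n_plus_1)][:-1]
print(energy_levels)
```

Up to rounding, the printed levels are $N+\tfrac12$ for $N = 0, 1, 2, \dots$, which is the content of the expression for $W$ above.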
In order to get rid of this term one usually replaces the sum $b^\dagger b+bb^\dagger$
in (433a) by $2b^\dagger b$, thus putting
$$\int_V\varphi^2_{\mathbf k\xi}\,dV=2|\gamma_{\mathbf k\xi}|^2\,V\,b^\dagger_{\mathbf k\xi}b_{\mathbf k\xi}=2|\gamma_{\mathbf k\xi}|^2\,V\,N_{\mathbf k\xi},$$
which is a wholly unwarranted procedure. It can be shown,
however, that in the accurate expression of the electromagnetic energy
which involves the sum of four terms (corresponding to the scalar
potential and to the three components of the vector potential) or of
six terms (corresponding to the three components of the electric intensity
and the three components of the magnetic intensity) the $\tfrac12$ cancels
out so that the energy reduces to an integral multiple of $h\nu$.


§ 49 WAVES WITH QUANTIZED AMPLITUDES 487 

In the general case of an electromagnetic field represented by a sum 
of terms of the form (433) satisfying given boundary conditions (corre- 
sponding, for example, to radiation enclosed in a vessel with perfectly 
reflecting walls), the integral $\int\varphi^2\,dV$, on account of the mutual
orthogonality of the different normal oscillations $\varphi_{\mathbf k\xi}$, reduces to the sum of
the expressions (433b) for all the values of $\mathbf k$, $\xi$ concerned.

We shall now apply the method of quantized electromagnetic waves
to the interaction between light and matter. The light will be considered
as a perturbation and the matter described in the usual way
by a superposition of the stationary states that would persist in the
absence of the perturbation, i.e.
$$\psi=\sum_r a_r\psi_r=\sum_r a_r\,\psi^0_r(x)\,e^{-i2\pi\nu_rt}.$$
The amplitude coefficients $a_r$ will be treated to begin with as ordinary
numbers; for the sake of simplicity the material system will be imagined
to consist of a single electron bound to a fixed centre of force (hydrogen-like
atom).

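The perturbation due to the light will result in the variation of the
coefficients $a_r$ with the time; this is determined by the well-known
equations
$$-\frac{h}{2\pi i}\frac{da_r}{dt}=\sum_s S_{rs}a_s. \tag{434}$$
The perturbation energy can be written in the form
$$S=T\varphi=\sum_{\mathbf k,\xi}T\varphi_{\mathbf k\xi}, \tag{434a}$$
where $T$ is some quantity characteristic of the atom, for instance, its
electric moment if $\varphi$ represents the electric force.
Substituting in (434a) the expression (433) for $\varphi_{\mathbf k\xi}$, we get
$$S=\sum_\alpha\gamma_\alpha\left(T^+_\alpha b_\alpha e^{-i2\pi\nu_\alpha t}+T^-_\alpha b^\dagger_\alpha e^{i2\pi\nu_\alpha t}\right), \tag{434b}$$
where the index $\alpha$ is an abbreviation for $\mathbf k,\xi$; $T^\pm_\alpha=-\dfrac{e}{cm}\,p_\xi\,e^{\pm i2\pi\mathbf k\cdot\mathbf r}$ if $\varphi$
denotes the vector potential, and $\nu_\alpha=ck$. Hence we get
$$S_{rs}=\sum_\alpha\gamma_\alpha\left[(T^+_\alpha)_{rs}\,b_\alpha\,e^{i2\pi(\nu_{rs}-\nu_\alpha)t}+(T^-_\alpha)_{rs}\,b^\dagger_\alpha\,e^{i2\pi(\nu_{rs}+\nu_\alpha)t}\right].$$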

So far the present theory is formally identical with the previously 
considered theory of the perturbation produced by classical (i.e. non- 
quantized) electromagnetic waves. We can therefore use for the amplitudes
$a_r$ the same approximate expressions as have been derived
before [(175), § 22]. It must be remembered, however, that the corresponding
probabilities $|a_r|^2$, just as the probability amplitudes $a_r$
($r\neq s$), are to be dealt with not as ordinary numbers but as operators.


In order to obtain results comparable with the experimental data we 
must consider the characteristic values of these operators, or their 
probable values for a number of states corresponding to different charac- 
teristic values. We need not discuss here the method of calculating 
these probable values since in the applications they are usually known 
a priori. The important thing to be noticed is that the use of quantized
electromagnetic waves involves the introduction of ‘second-order
probabilities’, i.e. of the probability that the ordinary (‘first-order’)
probability of some state ($r$) should have a given value, out of a number
of possible characteristic values. Instead of directly giving the value of
the transition probabilities, the operators $|a_r|^2$ considered as functions
of the time (with the condition that at $t = 0$ one of them only has
a characteristic value different from zero), will serve to determine the
probable (or average) values of these transition probabilities.

Another important point is the fact that in calculating the probability
operators $|a_r|^2$ we must take into account the non-commutative character
of the operators $b_\alpha$, $b^\dagger_\alpha$, whose squares or products occur in the expression
of the product of $a_r$ with $a_r^*$. It thus becomes necessary to define in
an unambiguous way the order in which the operators $a_r$ and $a_r^*$ must be
multiplied by each other. This order being adequately fixed, the commutation
relations which are satisfied by the operators $b_\alpha$, $b^\dagger_\alpha$ enable
one to incorporate in the perturbation theory of the radiative transitions
those transitions which are classically distinguished as spontaneous on
exactly the same footing as the ordinary ‘induced’ ones.

We shall consider, just as in § 22 (or § 18, Part I), a radiation with a
practically continuous spectrum (such as the thermal radiation in
statistical equilibrium at a given temperature). Assuming the material
system (atom) to be initially in a given state $s$, we get to a first
approximation ($r\neq s$)
$$a_r=\Delta a_r=-\frac1h\sum_\alpha\gamma_\alpha\left\{b_\alpha(T^+_\alpha)_{rs}\frac{e^{i2\pi(\nu_{rs}-\nu_\alpha)t}-1}{\nu_{rs}-\nu_\alpha}+b^\dagger_\alpha(T^-_\alpha)_{rs}\frac{e^{i2\pi(\nu_{rs}+\nu_\alpha)t}-1}{\nu_{rs}+\nu_\alpha}\right\}a^0_s, \tag{435}$$
where $a^0_s=a^{0*}_s=1$.
Let us consider in the first place a transition $s\to r$ to a state of
higher energy $W_r>W_s$ under the condition of unsharp resonance
with the electromagnetic waves in a small frequency range near
$\nu_\alpha=\nu_{rs}=(W_r-W_s)/h$. We can then drop the second term in (435) compared
with the first one. It now remains to multiply $a_r$ by its conjugate
complex, dropping all terms containing the $b_\alpha$’s with different values
of $\alpha$, and to sum over the frequency range considered.



Before we do this we must, however, make the following important
remark about the order of the factors in the product of $a_r$ with $a_r^*$.
According to (435) $a_r$ ($r\neq s$) must be considered not as an ordinary
number but as an operator of the same type as $b$; its conjugate complex
must be replaced accordingly by the adjoint operator
$$a_r^\dagger=(\Delta a_r)^\dagger=-\frac{a_s^{0\dagger}}{h}\sum_\alpha\gamma_\alpha\,b^\dagger_\alpha\,(T^+_\alpha)^*_{rs}\,\frac{e^{-i2\pi(\nu_{rs}-\nu_\alpha)t}-1}{\nu_{rs}-\nu_\alpha}$$
[which corresponds to the first term of (435)].

Correct results are then obtained if the operator which determines the
probability of the state is defined by the product $a_r^\dagger a_r$ and not by $a_ra_r^\dagger$.

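In carrying out the summation over the different oscillations we can
drop all those products $b^\dagger_\alpha b_\beta$ for which $\alpha\neq\beta$ (in view of the supposed
incoherent character of the radiation). This gives
$$a_r^\dagger a_r\approx a_s^{0\dagger}a_s^0\,\frac{\gamma_\nu^2}{h^2}\,\overline{|(T^+_\nu)_{rs}|^2}\;\frac{1}{\Delta\nu}\sum_{(\Delta\nu)}b^\dagger_\alpha b_\alpha\int_{(\Delta\nu)}\left|\frac{e^{i2\pi(\nu-\nu_{rs})t}-1}{\nu-\nu_{rs}}\right|^2d(\nu-\nu_{rs}),$$
where $\nu_\alpha=\nu_{rs}$ is the resonance frequency, $\overline{|(T^+_\nu)_{rs}|^2}$ the average value
of $|(T^+_\alpha)_{rs}|^2$ for all the directions of the vector $\mathbf k$ with the fixed magnitude
$\nu/c$, and $\Delta\nu$ a small frequency range containing the resonance frequency
and yet large enough to make the integrand very small compared with
1 for $\nu-\nu_{rs}=\pm\Delta\nu$. The integral being equal to $4\pi^2t$, we thus get
$$a_r^\dagger a_r=a_s^{0\dagger}a_s^0\,\frac{4\pi^2\gamma_\nu^2}{h^2}\,\overline{|(T^+_\nu)_{rs}|^2}\;\frac{1}{\Delta\nu}\sum_{(\Delta\nu)}b^\dagger_\alpha b_\alpha\;t.$$
Let $Z_\nu\,\Delta\nu$ be the number of different oscillations in the frequency
range under consideration, i.e. the number of summands in $\sum b^\dagger_\alpha b_\alpha$.
For isotropic thermal radiation $Z_\nu\,\Delta\nu=\dfrac{8\pi V}{c^3}\,\nu^2\Delta\nu$ [cf. Part I, § 29, (141)];
we can then put
$$\frac{1}{\Delta\nu}\sum_{(\Delta\nu)}b^\dagger_\alpha b_\alpha=Z_\nu\,\overline{b^\dagger_\nu b_\nu},$$
and consequently
$$a_r^\dagger a_r=a_s^{0\dagger}a_s^0\,B^+_{rs}\,\frac{Z_\nu h\nu}{V}\,\overline{b^\dagger_\nu b_\nu}\;t, \tag{435a}$$
where
$$B^+_{rs}=\frac{4\pi^2V}{h^3\nu}\,\gamma_\nu^2\,\overline{|(T^+_\nu)_{rs}|^2}\qquad(\nu=\nu_{rs}). \tag{435b}$$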
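
The value $4\pi^2t$ quoted for the resonance integral can be checked numerically. The following is only a sketch: it uses the identity $|e^{i2\pi\xi t}-1|^2=4\sin^2(\pi\xi t)$ with $\xi=\nu-\nu_{rs}$, replaces the frequency range by a finite cut-off `half_width`, and applies a plain midpoint rule; the function name and its parameters are illustrative and do not come from the text.

```python
from math import pi, sin

def resonance_integral(t, half_width=2000.0, steps=400_000):
    """Midpoint-rule estimate of the integral of |exp(i 2 pi xi t) - 1|^2 / xi^2
    over -half_width < xi < half_width. An even number of steps keeps xi = 0
    off the grid, where the integrand has a removable singularity."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        xi = -half_width + (k + 0.5) * h
        total += 4.0 * sin(pi * xi * t) ** 2 / (xi * xi)
    return total * h

t = 1.0
estimate = resonance_integral(t)
exact = 4.0 * pi ** 2 * t
print(estimate, exact)
```

The small residual difference comes from the neglected tails beyond the cut-off, which fall off as the inverse of `half_width`. The proportionality to $t$ is what makes the transition probability grow linearly with the time.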


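Let us now consider the opposite transition $r\to s$ due to the (unsharp)
resonance with electromagnetic oscillations of the same frequency as in
the preceding case. Reversing the indices $s$ and $r$, we obtain for the
probability amplitude of the transition $r\to s$ the expression
$$a_s=\Delta a_s=-\frac1h\sum_\alpha\gamma_\alpha\left\{b_\alpha(T^+_\alpha)_{sr}\frac{e^{i2\pi(\nu_{sr}-\nu_\alpha)t}-1}{\nu_{sr}-\nu_\alpha}+b^\dagger_\alpha(T^-_\alpha)_{sr}\frac{e^{i2\pi(\nu_{sr}+\nu_\alpha)t}-1}{\nu_{sr}+\nu_\alpha}\right\}a^0_r. \tag{436}$$
Since $\nu_{sr}=-\nu_{rs}$, and consequently $\nu_{sr}+\nu_\alpha\approx0$, we can now drop the
first term of this expression and not the second one, which gives
$$a_s^\dagger a_s=a_r^{0\dagger}a_r^0\,B^-_{rs}\,\frac{Z_\nu h\nu}{V}\,\overline{b_\nu b^\dagger_\nu}\;t. \tag{436a}$$
This expression differs from (435a) through the order of the factors $b$, $b^\dagger$,
and also in a minor way through the substitution of $B^-_{rs}$ for $B^+_{rs}$.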

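Now we have $b^\dagger_\alpha b_\alpha=N_\alpha$, $b_\alpha b^\dagger_\alpha=N_\alpha+1$,
where $N_\alpha$ is the operator representing the (integral) number of light
quanta associated with the oscillations of the type $\alpha$. Passing from
operators to probable values, we get
$$\frac{Z_\nu}{V}\,h\nu\,\overline{b^\dagger_\nu b_\nu}=\frac{8\pi\nu^2}{c^3}\,\bar N_\nu\,h\nu=\rho_\nu,$$
where $\rho_\nu$ is the spectral density of radiation per unit volume and
$$\frac{Z_\nu}{V}\,h\nu\,\overline{b_\nu b^\dagger_\nu}=\frac{8\pi\nu^2}{c^3}\,(\bar N_\nu+1)\,h\nu=\rho_\nu+\frac{8\pi h\nu^3}{c^3}.$$
Hence the probable values of the probabilities for the transitions $s\to r$
and $r\to s$ referred to unit time are
$$\Gamma_{sr}=B^+_{rs}\,\rho_\nu\qquad\text{if } W_r>W_s$$
and
$$\Gamma_{rs}=B^-_{rs}\left(\rho_\nu+\frac{8\pi h\nu^3}{c^3}\right)=A_{rs}+B^-_{rs}\,\rho_\nu. \tag{436b}$$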
We thus see that on the present theory ‘spontaneous’ transitions from
a state of higher energy to that of lower energy become completely
fused with the induced transitions of the same type. The relation
$$A_{rs}=\frac{8\pi h\nu^3}{c^3}\,B^-_{rs}$$
between the probability coefficients $A$ and $B$ referring to spontaneous
and induced transitions is just that which has been obtained in Part I,
§§ 17 and 18, by the method of ‘classical’ electromagnetic waves. The
only difference consists in the multiplication of the quantity $T$ characterizing
the atom by the factors $e^{\pm i2\pi\mathbf k\cdot\mathbf r}$ characterizing the radiation,
which corresponds to the introduction of two somewhat different coefficients
$B^+_{rs}$ (for absorption) and $B^-_{rs}$ (for emission) instead of the single
one considered before. It should be remarked, however, that for an
isotropic radiation characterized by all the directions of the vector $\mathbf k$
being equally probable, the two coefficients are identical. If, moreover,
the wave-length $\lambda=1/k$ is large compared with the effective linear
extension of the atom the factors $e^{\pm i2\pi\mathbf k\cdot\mathbf r}$ can be dropped altogether.
The expression (435b) reduces in this case to that obtained before
(Part I, § 17), if $T$ is defined as the electric moment of the atom in the
direction of the electric intensity $\varphi$. Substituting the corresponding
expression (433c) for $\gamma_\nu$ in (435b), we get, since
$$\overline{|T_{rs}|^2}=\frac{e^2}{3}\left(|x_{rs}|^2+|y_{rs}|^2+|z_{rs}|^2\right),$$
$$B_{rs}=\frac{8\pi^3e^2}{3h^2}\left(|x_{rs}|^2+|y_{rs}|^2+|z_{rs}|^2\right),$$
in agreement with (103), § 18 of Part I.
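
As a numerical illustration of the relation between $A$ and $B$, the ratio of spontaneous to induced emission can be evaluated for thermal radiation. The sketch below assumes Planck's formula for $\rho_\nu$, which is not derived in this section, so that $A/(B\rho_\nu)$ reduces to $e^{h\nu/kT}-1$; the function names, the sample frequency and the two temperatures are illustrative choices.

```python
from math import expm1, pi

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J / K
C = 2.99792458e8      # velocity of light, m / s

def a_over_b(nu):
    """Ratio A/B = 8 pi h nu^3 / c^3 of the spontaneous and induced coefficients."""
    return 8.0 * pi * H * nu ** 3 / C ** 3

def planck_density(nu, temperature):
    """Spectral energy density rho_nu of black-body radiation (assumed here)."""
    return a_over_b(nu) / expm1(H * nu / (K_B * temperature))

def spontaneous_to_induced(nu, temperature):
    """A / (B rho_nu) for an atom in thermal radiation at the given temperature."""
    return a_over_b(nu) / planck_density(nu, temperature)

nu_visible = 5.0e14   # Hz, roughly yellow light (illustrative value)
for temperature in (300.0, 6000.0):
    print(temperature, spontaneous_to_induced(nu_visible, temperature))
```

For visible light the ratio is enormous at room temperature and still well above unity at a few thousand degrees, which is why the induced term is small compared with the spontaneous one under ordinary circumstances.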

As a second illustration of the method of quantized electromagnetic
waves we shall apply it, following Rosenfeld, Weisskopf and Wigner,
to the problem of the radiation damping.

Let us return to the perturbation equations (434) and let us assume
for the sake of simplicity that $S_{rs}$ is different from zero for two states
only, $r=1$ and $s=0$ say (the diagonal elements $S_{11}$ and $S_{00}$ likewise
vanishing).

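The equations (434) reduce under these conditions to the following
two:
$$-\frac{h}{2\pi i}\frac{da_1}{dt}=f\,a_0,\qquad-\frac{h}{2\pi i}\frac{da_0}{dt}=g\,a_1, \tag{437}$$
where
$$f=\sum_\alpha\gamma_\alpha\left[(T^+_\alpha)_{10}\,b_\alpha\,e^{i2\pi(\nu_{10}-\nu_\alpha)t}+(T^-_\alpha)_{10}\,b^\dagger_\alpha\,e^{i2\pi(\nu_{10}+\nu_\alpha)t}\right],$$
$$g=\sum_\alpha\gamma_\alpha\left[(T^+_\alpha)_{01}\,b_\alpha\,e^{-i2\pi(\nu_{10}+\nu_\alpha)t}+(T^-_\alpha)_{01}\,b^\dagger_\alpha\,e^{i2\pi(\nu_\alpha-\nu_{10})t}\right]. \tag{437a}$$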

We shall assume that at the initial moment ($t=0$) $a_1=1$ and $a_0=0$,
which means that the atom was initially in the excited state, and shall
try to solve our problem more exactly than was done before (when
$a_1$ was considered as constant) by putting
$$a_1=e^{-2\pi\Gamma t}. \tag{438}$$
This corresponds to a radioactive-like decay of the number of atoms
in the excited state (1) owing to their spontaneous transition to the
normal one (0). Substituting this expression in the second equation
(437) and integrating, we get
$$a_0=-\frac1h\sum_\alpha\gamma_\alpha\left[-(T^+_\alpha)_{01}\,b_\alpha\,\frac{e^{-i2\pi(\nu_{10}+\nu_\alpha-i\Gamma)t}-1}{\nu_{10}+\nu_\alpha-i\Gamma}+(T^-_\alpha)_{01}\,b^\dagger_\alpha\,\frac{e^{i2\pi(\nu_\alpha-\nu_{10}+i\Gamma)t}-1}{\nu_\alpha-\nu_{10}+i\Gamma}\right]. \tag{438a}$$
The first term in this sum can be dropped so long as $\nu_\alpha$ lies in the
vicinity of $\nu_{10}$ just as in the derivation of (436a).

In order to find the decay or damping constant $\Gamma$ we must substitute
this expression in the first equation (437) with due account of the order
of the factors $b$, $b^\dagger$ and sum over all the $\alpha$’s in the resonance range $\Delta\nu$.
In doing this we can drop the second term in $f$, for in view of the
incoherent character of the oscillations the probable (average) value of
$b^\dagger_\alpha b^\dagger_\beta$ vanishes both when $\alpha\neq\beta$ and $\alpha=\beta$, the only non-vanishing
terms being those containing the products $b_\alpha b^\dagger_\alpha=N_\alpha+1$. We thus
find
$$-ih\Gamma\,e^{-2\pi\Gamma t}=-\frac1h\sum_\alpha\gamma_\alpha^2\,(T^+_\alpha)_{10}(T^-_\alpha)_{01}\,(N_\alpha+1)\,\frac{e^{-2\pi\Gamma t}-e^{i2\pi(\nu_{10}-\nu_\alpha)t}}{\nu_\alpha-\nu_{10}+i\Gamma},$$
that is,
$$\Gamma=\frac{1}{h^2}\sum_\alpha\gamma_\alpha^2\,(T^+_\alpha)_{10}(T^-_\alpha)_{01}\,(N_\alpha+1)\,\frac{1-e^{i2\pi(\nu_{10}-\nu_\alpha-i\Gamma)t}}{i(\nu_\alpha-\nu_{10}+i\Gamma)}. \tag{438b}$$
Replacing here the operator $N_\alpha$ as before by its probable (average)
value $(c^3/8\pi h\nu^3)\rho_\nu$, and further replacing the summation over $\alpha$ by an
integration over $\nu$ with the expression $Z_\nu\,d\nu=8\pi V\nu^2\,d\nu/c^3$ for the number
of $\alpha$-values in the range $d\nu$, we get
$$\Gamma=\frac{\gamma_\nu^2}{h^2}\,\overline{(T^+_\nu)_{10}(T^-_\nu)_{01}}\;Z_\nu\left(1+\frac{c^3}{8\pi h\nu^3}\,\rho_\nu\right)\int_{-\infty}^{+\infty}\frac{1-e^{i2\pi(\nu_{10}-\nu-i\Gamma)t}}{i(\nu-\nu_{10}+i\Gamma)}\,d(\nu-\nu_{10}),$$
where $\nu$ denotes the resonance frequency $\nu_{10}$, and $\overline{(T^+_\nu)_{10}(T^-_\nu)_{01}}$ the
average value of the product $(T^+_\alpha)_{10}(T^-_\alpha)_{01}$ for all the directions of the
vector $\mathbf k$ with the fixed magnitude $k=\nu/c$. The integral $J$ appearing
in this formula is easily seen to be independent of the value of the
parameter $\Gamma$ and to be equal to $\pi$. We have in fact, putting $\Gamma=0$
and $\nu-\nu_{10}=\xi$,
$$J=\int_{-\infty}^{+\infty}\frac{1-e^{-i2\pi t\xi}}{i\xi}\,d\xi=\int_{-\infty}^{+\infty}\frac{1-\cos2\pi t\xi}{i\xi}\,d\xi+\int_{-\infty}^{+\infty}\frac{\sin2\pi t\xi}{\xi}\,d\xi.$$
The first term obviously vanishes since the integrand is an odd function
of $\xi$, while the second reduces to the well-known integral of Laplace,
which is equal to $\pi$.

Thus, if we neglect the difference between the factors $T^+_\alpha$ and $T^-_\alpha$,
replacing them simply by $T$ (which is always permissible if the resonance
wave-length, $\lambda=c/\nu_{10}$, is large compared with the effective dimensions
of the atom), and take into account the relation (435b), we obtain for
the constant $\Gamma$ the following expression:
$$\Gamma=\frac{A_{10}}{4\pi}\left[1+\frac{c^3}{8\pi h\nu^3}\,\rho_\nu\right],$$
whence
$$4\pi\Gamma=A_{10}+B_{10}\,\rho_\nu. \tag{439}$$
This quantity is usually denoted as the damping exponent since the
number of atoms in the excited state $a_1^\dagger a_1$ decreases with the time as
$e^{-4\pi\Gamma t}$. Under ordinary circumstances the second term in (439) is small
compared with the first one, so that the damping constant is numerically
equal to the probability of a spontaneous transition between the corresponding
states.
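
A short worked step may help to connect (438) and (439) with the notion of a mean life. The symbol $\tau$ for the mean life is introduced here for illustration only and does not occur in the text above.

```latex
% Probability of finding the atom in the excited state, from (438):
%   a_1 = e^{-2\pi\Gamma t}  =>  a_1^\dagger a_1 = e^{-4\pi\Gamma t}.
% Mean life of the excited state, weighting t with the decay rate:
\[
  \tau \;=\; \int_0^\infty t\,(4\pi\Gamma)\,e^{-4\pi\Gamma t}\,dt
       \;=\; \frac{1}{4\pi\Gamma}
       \;=\; \frac{1}{A_{10}+B_{10}\rho_\nu}
       \;\approx\; \frac{1}{A_{10}}
  \qquad (B_{10}\rho_\nu \ll A_{10}).
\]
```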

In the general case when the atom is initially in an excited state $r$
from which spontaneous transitions are possible to several states of
lower energy $s$, the damping constant is equal to the sum of the corresponding
transition probabilities
$$4\pi\Gamma_r=\sum_sA_{rs}\qquad(W_s<W_r). \tag{439a}$$
The probability amplitude of the $r$th state $a_r$ decreases with the time
like $e^{-2\pi\Gamma_rt}$. Multiplying this expression by $\psi_r=\psi^0_r(x)\,e^{-i2\pi\nu_rt}$ we can
treat the resulting function
$$a_r\psi_r=\psi^0_r(x)\,e^{-i2\pi(\nu_r-i\Gamma_r)t}$$
as representing damped vibrations, corresponding to a complex value of
the frequency $\nu_r-i\Gamma_r$. Such damped vibrations starting at a certain
instant $t=0$ can be analysed into a series of undamped harmonic
vibrations, according to the equation
$$f(t)=e^{-i2\pi(\nu_r-i\Gamma_r)t}=\int_{-\infty}^{+\infty}A_\nu\,e^{-i2\pi\nu t}\,d\nu\qquad(t>0),$$
where
$$A_\nu=\int_0^\infty f(t)\,e^{i2\pi\nu t}\,dt=\int_0^\infty e^{i2\pi(\nu-\nu_r+i\Gamma_r)t}\,dt=\int_0^\infty e^{-2\pi[\Gamma_r-i(\nu-\nu_r)]t}\,dt=\frac{1}{2\pi[\Gamma_r-i(\nu-\nu_r)]},$$
so that
$$|A_\nu|^2=\frac{1}{4\pi^2}\,\frac{1}{\Gamma_r^2+(\nu-\nu_r)^2}.$$
This corresponds to an effective spectral width $\Gamma_r$ of the state in question,
in agreement with the interpretation of complex energy values given
in Part I, § 15, in connexion with the problem of radioactive decay.
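
The Fourier amplitude of the damped vibration can be checked numerically. The following sketch compares a direct midpoint-rule integration with the closed form $1/\{2\pi[\Gamma_r-i(\nu-\nu_r)]\}$ and confirms that $|A_\nu|^2$ falls to half its peak value at $\nu-\nu_r=\pm\Gamma_r$; the function names, the cut-off `t_max` and the sample values are assumptions made for illustration.

```python
import cmath
from math import pi

def amplitude_closed_form(nu, nu_r, gamma):
    """A_nu = 1 / (2 pi [Gamma_r - i (nu - nu_r)])."""
    return 1.0 / (2.0 * pi * complex(gamma, -(nu - nu_r)))

def amplitude_numeric(nu, nu_r, gamma, t_max=5.0, steps=50_000):
    """Midpoint-rule estimate of the integral of f(t) exp(i 2 pi nu t) from 0 to
    t_max, with f(t) = exp(-i 2 pi (nu_r - i Gamma_r) t). The cut-off t_max must
    satisfy 2 pi Gamma_r t_max >> 1 for the neglected tail to be negligible."""
    h = t_max / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * h
        total += cmath.exp(2j * pi * (nu - nu_r + 1j * gamma) * t)
    return total * h

nu_r, gamma = 10.0, 1.0          # arbitrary units
nu = nu_r + 0.7
numeric = amplitude_numeric(nu, nu_r, gamma)
closed = amplitude_closed_form(nu, nu_r, gamma)

# |A_nu|^2 falls to half its peak value at nu - nu_r = +/- Gamma_r.
peak = abs(amplitude_closed_form(nu_r, nu_r, gamma)) ** 2
half = abs(amplitude_closed_form(nu_r + gamma, nu_r, gamma)) ** 2
print(abs(numeric - closed), half / peak)
```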




50. Application of Quantized Electron Waves to the Emission 
and Scattering of Radiation 


If in the function
$$\psi(x,t)=\sum_ra_r\,\psi_r(x,t)=\sum_ra_r\,\psi^0_r(x)\,e^{-i2\pi\nu_rt}, \tag{440}$$
representing the undisturbed motion of the electron, the coefficients
$a_r$ are treated not as ordinary complex numbers, but as operators
satisfying the relations
$$a_r^\dagger a_s+a_sa_r^\dagger=\delta_{rs},\qquad a_ra_s+a_sa_r=0,\qquad a_r^\dagger a_s^\dagger+a_s^\dagger a_r^\dagger=0, \tag{440a}$$
$\psi$ will represent the motion of any given number of electrons, distributed
over the individual states $\psi_r$, the number of electrons associated with
a particular state $r$ being defined as the characteristic value of the
operator
$$a_r^\dagger a_r=n_r \tag{440b}$$
(i.e. 1 or 0); it should be remembered that the product $a_ra_r^\dagger$ is equal
to $1-n_r$.
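
The relations (440a) and the statement that $n_r$ takes the values 1 or 0 only can be verified on an explicit matrix model. The sketch below uses the Jordan-Wigner construction for two states, which is a standard representation and not one given in the text; all function and variable names are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]        # sign factor that makes different states anticommute
A = [[0, 1], [0, 0]]         # single-state annihilation: occupied -> vacant

a1 = kron(A, I2)             # operator a_r for the first state
a2 = kron(Z, A)              # operator a_s for the second state
a1d, a2d = transpose(a1), transpose(a2)   # real matrices: adjoint = transpose

def anticommutator(x, y):
    return add(matmul(x, y), matmul(y, x))

identity4 = kron(I2, I2)
zero4 = [[0] * 4 for _ in range(4)]

n1 = matmul(a1d, a1)         # occupation-number operator n_r = a_r† a_r (diagonal)
occupation_values = sorted(n1[i][i] for i in range(4))
print(occupation_values)     # [0, 0, 1, 1]
```

The same matrices give $a_ra_r^\dagger = 1-n_r$, as stated above; the arithmetic is exact because only the integers 0 and ±1 occur.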

It has been shown by Heisenberg that with this definition of $\psi$, corresponding
to quantized electron waves, it is possible to give an adequate
description of the emission (and scattering) of radiation in terms of the
classical electromagnetic theory, if, following Schrödinger, we replace
the classical mechanical quantities (coordinates, velocities, etc.) by their
average or probable values.

This wave-mechanical theory of light emission has been discussed
already in Part I, § 17, with the help of ‘classical’ (i.e. unquantized)
electron waves as giving rise to classical electromagnetic waves. It has 
been shown there that light vibrations defined as ‘beats’ (‘difference 
tones’) between two electron waves have correct frequencies, but that 
their amplitude is proportional not only to the probability of the initial 
state but also to that of the final one—which contradicts the photon 
theory of radiation. Now this contradiction can be removed if the 
‘classical’ electron waves are replaced by quantized ones; the resulting 
electromagnetic waves appear likewise as quantized although in a way 
somewhat different from that considered in the preceding section. 

The mechanical quantity which determines the radiation emitted by
an atom can be defined according to Schrödinger’s theory as the
probable value of the electric moment of the atom
$$\bar P=\int\psi^*P\,\psi\,dV.$$
If we are concerned with several electrons $P$ must denote the sum
$\sum_ie\mathbf r_i$, where $\mathbf r_i$ is the radius vector of the $i$th electron (with respect to
the nucleus), and $\psi$ an antisymmetrical function of the coordinates
of all the electrons, the integration being extended over the whole
configuration space. Introducing quantized electron waves, we can
represent the totality of the electrons by the three-dimensional operator-function
(440), and replace the preceding expression for the probable
value of the resulting electric moment by the operator
$$\mathsf P=\int\psi^\dagger P\,\psi\,dV, \tag{441}$$
whose characteristic values must be considered as the probable values
of $P$. Just as for the quantized electromagnetic waves discussed in the
preceding section, we are thus concerned with probabilities of the
‘second order’, i.e. the probabilities of certain probable values of $P$, the
corresponding second-order probability amplitudes $C$ being defined by
an equation of the form $\mathsf PC=P'C$. As a matter of fact, we need
not bother about these probabilities, for the quantity we are actually
interested in, and which can be directly compared with the experimental
facts, is the probable value $\bar{\mathsf P}$ of the operator $\mathsf P$, which, as we shall
presently see, can usually be determined directly.

It should be emphasized that the order in which the two factors
$\psi^\dagger$ and $\psi$ appear in the expression (441) is an essential feature of this
expression, since these factors do not commute with each other. We
should obtain wrong results if the operator $\mathsf P$ were defined as $\int\psi P\,\psi^\dagger\,dV$.

Substituting in (441) the expression (440) for $\psi$, we get
$$\mathsf P(t)=\sum_r\sum_sa_r^\dagger a_s\,P^0_{rs}\,e^{i2\pi\nu_{rs}t}, \tag{441a}$$
and consequently
$$\frac{d^2\mathsf P(t)}{dt^2}=-\sum_r\sum_sa_r^\dagger a_s\,P^0_{rs}\,(2\pi\nu_{rs})^2\,e^{i2\pi\nu_{rs}t}. \tag{441b}$$
This expression can be considered as defining in the same way as in
the classical electromagnetic theory the electric and magnetic field
generated by the atom at sufficiently remote points.

The electrical intensity in a given direction $\tau$, say, at a distance $R$
from this atom (the unit vector $\tau$ being perpendicular to $R$) is thus
represented by the operator
$$E_\tau(R,t)=-\frac{1}{c^2R}\,\ddot{\mathsf P}_\tau(t-R/c),$$
$c$ being the velocity of light, that is,
$$E_\tau=E^+_\tau+E^-_\tau=\frac{1}{c^2R}\sum_r\sum_sa_r^\dagger a_s\,(P^0_\tau)_{rs}\,(2\pi\nu_{rs})^2\,e^{i2\pi\nu_{rs}(t-R/c)}, \tag{442}$$
where $E^-_\tau$ corresponds to terms with negative frequencies and $E^+_\tau$ to
those with positive frequencies.

The electric field defined by (442) is an operator, of a type somewhat
similar to that defined in the preceding section with the help of the
operators $b$, $b^\dagger$, the operators $a_r^\dagger a_s$ corresponding to $b$ if $\nu_{rs}<0$
($W_r<W_s$) and to $b^\dagger$ if $\nu_{rs}>0$ ($W_r>W_s$). The connexion between the
two types of operators will be examined later on. We are concerned
here only with the fact that in order to obtain the observed electric
field we must take the characteristic or probable values of (442). In
the absence of definite phase relations between the operators $a_r$ and $a_s$
referring to different states, i.e. when the different harmonic terms in
(440) are incoherent with regard to each other, the probable values
of $a_r^\dagger a_s$ are equal to zero so long as $r\neq s$, so that the probable value
of (442) vanishes. This is practically equivalent to the fact that
the average value of $E_\tau$ with respect to the time is equal to zero. The
quantity we are interested in is, however, not the electric field-strength
but the corresponding energy. According to the classical theory, the
latter (or more exactly the energy-density) is proportional to the square
of $E_\tau$. In order to obtain the operator which serves to define the energy
in agreement with the photon theory of radiation we must, instead of
squaring $E_\tau$, multiply $E^+_\tau$ by $E^-_\tau$ in the order stated (just as in the
preceding section where $E^+$ was replaced by $\varphi^\dagger$ and $E^-$ by $\varphi$). This
gives
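$$E^+E^-=\frac{1}{c^4R^2}\sum_{r>s}\sum_{r'>s'}a_r^\dagger a_s\,a_{s'}^\dagger a_{r'}\,(16\pi^4\nu_{rs}^2\nu_{r's'}^2)\,P^0_{rs}P^0_{s'r'}\,e^{i2\pi(\nu_{rs}+\nu_{s'r'})(t-R/c)}, \tag{442a}$$
it being understood that $\nu_{rs}>0$ if $r>s$ (the index $\tau$ is dropped for
convenience).
We shall take in the first place the time average of this expression,
which can be done by keeping those terms only for which $\nu_{rs}+\nu_{s'r'}=0$,
that is, $r'=r$ and $s'=s$. We thus get
$$\overline{E^+E^-}=\frac{1}{c^4R^2}\sum_{r>s}a_r^\dagger a_s\,a_s^\dagger a_r\,|P^0_{rs}|^2\,(2\pi\nu_{rs})^4.$$
It should be mentioned that the same result is obtained by averaging
over the phases of the operators $a_r$, etc., if they are assumed to correspond
to incoherent vibrations.

Now $a_r^\dagger a_s\,a_s^\dagger a_r=-a_r^\dagger a_s\,a_ra_s^\dagger=a_r^\dagger a_r\,a_sa_s^\dagger=n_r(1-n_s)$, so that
$$\overline{E^+E^-}=\frac{1}{c^4R^2}\sum_{r>s}n_r(1-n_s)\,|P^0_{rs}|^2\,(2\pi\nu_{rs})^4. \tag{442b}$$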


This formula shows that the intensity of the emitted light is equal to
the sum of terms corresponding to a combination of two states ($r$, $s$),
provided the upper state is occupied ($n_r=1$) and the lower vacant ($n_s=0$).
This result is in complete harmony with what should be expected on
the photon theory of light emission in connexion with Pauli’s exclusion
principle. The formula (442b) can thus be regarded as the improved
version of the ‘classical’ wave-mechanical equation (92) of Part I, § 17,
where the upper and lower states appeared in a quite symmetrical
manner. Indeed, we come back to this result if we consider the amplitudes
$a_r$, $a_s$ as ordinary numbers and not as operators.
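
The reduction of the operator product to $n_r(1-n_s)$ rests only on the relations (440a). The lines below are a sketch of that reduction for $r\neq s$, written out step by step.

```latex
% Reduction of the operator product in the time-averaged intensity (r \neq s).
\begin{align*}
a_r^\dagger a_s\, a_s^\dagger a_r
  &= -\,a_r^\dagger a_s\, a_r a_s^\dagger
     && \text{since } a_s^\dagger a_r = -a_r a_s^\dagger \ (r \neq s),\\
  &= \phantom{-}a_r^\dagger a_r\, a_s a_s^\dagger
     && \text{since } a_s a_r = -a_r a_s,\\
  &= n_r\,(1-n_s)
     && \text{since } a_r^\dagger a_r = n_r,\ a_s a_s^\dagger = 1-n_s .
\end{align*}
```

With ordinary numbers in place of the operators the same product would be $|a_r|^2|a_s|^2$, which is symmetrical in the two states.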

If in the expression (441a) $a_r$ and $a_s$ are multiplied by the damping
factors $e^{-2\pi\Gamma_rt}$ and $e^{-2\pi\Gamma_st}$, the light vibrations with the frequency
$\nu_{rs}=(W_r-W_s)/h$ due to the combination of the corresponding states
appear as damped with the damping constant
$$\Gamma_{rs}=\Gamma_r+\Gamma_s=\frac{1}{4\pi}\Bigl(\sum_{p<r}A_{rp}+\sum_{p<s}A_{sp}\Bigr).$$


The effective width of the spectral line emitted in a transition from 
one state to another is thus equal to the sum of the widths of both the 
initial and final states. 

We shall now investigate, with the help of the formula (442), the light 
emitted by an atom under the perturbing influence of ‘primary’ electro- 
magnetic waves, or, in other words, the phenomenon of the scattering 
of radiation. As has been shown in Chap. V, § 23, the interpretation 
of this phenomenon from the purely mechanical point of view neces- 
sitates the consideration of double transitions, which correspond to the 
second approximation in the solution of the perturbation problem. If, 
however, we consider the radiation emitted (scattered) by the perturbed 
atom, we can confine ourselves to the first approximation, which in 
conjunction with equation (442) gives equivalent results. 

Let us for a moment treat the coefficients $a_r$ as ordinary numbers,
and define the electric field of the primary light waves by the expression
$$E^0=\tfrac12\left(b\,e^{-i2\pi\nu t}+b^\dagger e^{i2\pi\nu t}\right),$$
where $b$ is the (complex) amplitude of $E^0$. Let us assume further that
at the initial moment, $t=0$, the coefficients $a_q, a_{q'},\dots$ are different
from 0, while all the other coefficients $a_r, a_{r'},\dots$ vanish. We then get
from (435), with $T^\pm$ replaced by $P_\sigma$, the component of $P$ in the direction
$\sigma$ of the vector $E^0$, and the summation over $\alpha$ by a summation over $q$:
$$\Delta a_r=-\frac{1}{2h}\sum_qa^0_q\,(P^0_\sigma)_{rq}\left(b\,\frac{e^{-i2\pi(\nu-\nu_{rq})t}-1}{\nu_{rq}-\nu}+b^\dagger\,\frac{e^{i2\pi(\nu+\nu_{rq})t}-1}{\nu_{rq}+\nu}\right).$$
Substituting this expression and the similar expression for the conjugate
complex
$$(\Delta a_r)^\dagger=-\frac{1}{2h}\sum_{q'}a^{0\dagger}_{q'}\,(P^0_\sigma)_{q'r}\left(b^\dagger\,\frac{e^{i2\pi(\nu-\nu_{rq'})t}-1}{\nu_{rq'}-\nu}+b\,\frac{e^{-i2\pi(\nu+\nu_{rq'})t}-1}{\nu_{rq'}+\nu}\right)$$
in the formula (441a) for the electric moment or its projection in a given
direction $\tau$ and dropping small terms of the second order, we have
$$\mathsf P_\tau(t)=\sum_r\Bigl[\sum_q(\Delta a_r)^\dagger a^0_q\,(P^0_\tau)_{rq}\,e^{i2\pi\nu_{rq}t}+\sum_{q'}a^{0\dagger}_{q'}\,\Delta a_r\,(P^0_\tau)_{q'r}\,e^{i2\pi\nu_{q'r}t}\Bigr].$$
If furthermore we drop irrelevant terms which do not contain the
primary frequency (they can actually be considered as fading away
owing to the damping), we get, with the help of the relations
$$\nu_{rq}+\nu_{q'r}=\nu_{q'q}=-\nu_{qq'},$$


a 5 wl t err a Mabie ds a 
P(t) = xm p3 Z (agra. a°(P Se P2) |B <= +6 Sr ae 


e~t2m(v—vq'ght et2a(v +rggt 
+o 


agai P2)oy(P2)r| 2 i anil Vag tv 
rq 7q 


or, rearranging the different terms, 
P(t) = 2 2 [astag Unga be-t2my—reek + ghtad, Ugg’ bt efemy—ve'el] +. 
q 
+ 2 2 [a?tae, Usa’ be- fav +a + attad ute bt eid tye), (443) 


In this formula 


“— eens a areal 
“oa ~ Oh og ~ oh -) 
(P2)(P2) (Paps P2hy [OO 
ania \* olgr\* r/rq’ = q‘r\* o/rg 
Maa ~ ~ oy sD. Vagtv ” “ia Qh poe Vpg-ty 
The electric field strength of the scattered radiation at a distance $R$,
$E_\tau(t)=-\dfrac{1}{c^2R}\,\ddot{\mathsf P}_\tau(t-R/c)$, is thus given by the formula
E,(t) = aR {[27(v—v_-_) P[agiag Ure be-i2mv—v¢'gXt-Ric) 4 
+aota®, uz, btetemy—reot—Riey}, 
_ apllen+ Vag) |*[ a9 'a9. wd, be~ try treet Ric) +. 
+astae Ugg bt eter tye eXt-Ric))} (443 b ) 


Although in the preceding calculation the coefficients $a_r$, etc., were
dealt with as ordinary numbers, the results obtained remain valid if
we regard them as operators, since in writing down the products $a_r^\dagger a_s$,
etc., we have always preserved the correct order of the factors. The
smallness of the first-order coefficients $\Delta a_r$ must be understood to mean
in this case the smallness of the average (or probable) values of the
corresponding probabilities $|a_r|^2$ (i.e. the predominance of the characteristic
values $|a_r|^2=0$).

Let us consider separately the special case when the atom is supposed
to be initially in a definite state $q$. The double sum (443) reduces in
this case to the single term


P(t) =e oe (be-t27t_t bt ei2avty (444) 
’ mess ee Var Pear Para 
where Woq = 7 5) ae (444 a) 
and the electric field strength (443 b) to 
es _ (2a)? — R 
E,() = —" ap Blt (444 b) 


The scattered radiation has in this case the same frequency as the
primary one. This is the so-called simple or Rayleigh scattering. In
the general case of equation (443a) we obtain in addition to this simple
scattering a ‘combination’ or Raman scattering with a number of
modified frequencies $\nu\pm\nu_{q'q}$.

In order to obtain the average energy of the scattered rays we must
take the square of $E_\tau$ or, more exactly, the product of $E^+_\tau$ with $E^-_\tau$.


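For Rayleigh scattering this gives
$$E^+_\tau E^-_\tau\sim u_{qq}^2\,(a_q^{0\dagger}a_q^0)^2\,b^\dagger b,$$
that is,
$$E^+_\tau E^-_\tau\sim u_{qq}^2\,(n^0_q)^2\,b^\dagger b=u_{qq}^2\,n^0_q\,b^\dagger b, \tag{445}$$
since the characteristic values of $(n^0_q)^2$ and $n^0_q$ are the same (1 or 0).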

For the Raman scattering the situation is somewhat more complicated.
We shall consider separately the scattered rays with the
frequency $\nu-\nu_{q'q}$ and those with the frequency $\nu+\nu_{q'q}$.

Taking the time average of $E^+_\tau E^-_\tau$, according to (443b) we get

J vag ~~ U—Vatqg) ag Ugg Aha. adtas bth. 
that is, J vgg ~~ (UVa) Mag Ugg NI —n?); (4465 a) 
and in a similar eg 
bogie ~~ (vty fe) tly Uda’ atadaotat.b'b 
or Dysvarg ~~ UVa) MUgg Ugg 23-1 —ng)b*b. (445 b) 
These results are in harmony with the experimental data and with the
elementary theory (due to Smekal) of the Raman effect, based on the
idea of photons. In order to secure complete agreement we must make,
however, the additional assumption that $\nu_{q'q}$ is positive, i.e. that
$W_{q'}>W_q$.

The scattered photon with the decreased frequency $\nu-\nu_{q'q}$ is obtained
on this view if the atom was initially in the lower state ($n^0_q=1$), the
higher state $q'$ being vacant ($n^0_{q'}=0$). In the contrary case ($n^0_{q'}=1$,
$n^0_q=0$) the atom jumps from the higher state to the lower one, adding
the energy $h\nu_{q'q}$ to that of the incident photon, which results in the emission
of the scattered photon with the increased frequency $\nu+\nu_{q'q}$.

It should be mentioned that the intermediate states $r$, which determine
the intensity of the scattered radiation through the factors $u$,
in contradistinction to the final state $q$ or $q'$, need not be vacant,
since the corresponding numbers (operators) $n_r$ do not appear in the
equations (444a) and (444b). This can be explained by the fact that
if some intermediate state $r$ is occupied, the electron starting from the
state $q$, say, is interchanged with the electron in the state $r$, which
passes to the final state $q'$.

The probability amplitude of such double transitions $q\to r\to q'$ with
interchange must be the same as for double transitions without interchange,
since the electrons are indistinguishable.

The expressions 


(¥ Vig) tag: Mag 

and (vv 91q) Mugg Mag’ 

which are a measure of the intensity of the scattered radiation with
the frequency $\nu\pm\nu_{q'q}$, are in agreement with the expressions (184)
derived in § 23 for the probability of the double transitions which are
responsible for the scattering.

The preceding theory of the scattering process can be improved by
taking account of the damping which is described by adding to the
frequency $\nu_r$ of each state the imaginary term $-i\Gamma_r$ considered above.
This correction becomes especially important in the neighbourhood of
resonance. We thus get, for example, instead of (444a),


ugg = — 1 todo Pe 


qq aa 2 2 : > 
ha viol, 


where $\Gamma_{rq}=\Gamma_r+\Gamma_q$ is the damping factor for the line emitted in the
transition between the states $q$ and $r$. This expression remains finite
when $\nu=\nu_{rq}$, determining the polarization and intensity of the so-called
‘resonance radiation’.

The radiation theory sketched above is inexact in the sense that it does
not take into account adequately the retarded character of the
electromagnetic actions. This has been done approximately by substituting
the difference $t - R/c$ for $t$, where $R$ is the distance of some point (centre,
say) of the atom from the point in question. This approximation does
not hold, however, if the wave-length of the emitted or scattered light
$\lambda = c/\nu$ is of the same order of magnitude as or smaller than the linear
extension of the atom. The electromagnetic field generated by the
latter can be determined in this case by the classical expressions for
the scalar and the vector potential


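$$\phi(\mathbf r,t) = \varepsilon\int\frac{\rho(\mathbf r',\,t-R/c)}{R}\,dV', \qquad
\mathbf A(\mathbf r,t) = \frac{\varepsilon}{c}\int\frac{\mathbf j(\mathbf r',\,t-R/c)}{R}\,dV', \tag{446}$$

where $R = |\mathbf r-\mathbf r'|$ is the distance of the point considered from some
point $\mathbf r'$ in the volume-element $dV'$ of the electron-cloud. Here $\varepsilon$ denotes
the charge of the electron, while

$$\rho = \psi^\dagger\psi \tag{446 a}$$

is the density of the cloud and $\mathbf j$ the corresponding current density.†
According to Schrödinger’s theory, the latter is given by

$$\mathbf j = \frac{1}{m}\,\psi^\dagger\mathbf u\,\psi, \tag{446 b}$$

where $\mathbf u = \frac{h}{2\pi i}\nabla - \frac{\varepsilon}{c}\mathbf A$ is the operator of proper momentum, whereas
according to Dirac’s theory

$$\mathbf j = \psi^\dagger c\gamma\,\psi, \tag{446 c}$$

$c\gamma$ being the velocity matrix and $\psi$ the operator corresponding to
Dirac’s wave function. Substituting for the latter the expression (440),
where $x$ is an abbreviation for the geometrical coordinates and the
spin-coordinate, we get

$$\rho = \sum_r\sum_s a_r^\dagger a_s\,\psi_r^{0*}(x)\,\psi_s^{0}(x)\,e^{i2\pi\nu_{rs}t},$$
$$\mathbf j = \sum_r\sum_s a_r^\dagger a_s\,\psi_r^{0*}(x)\,c\gamma\,\psi_s^{0}(x)\,e^{i2\pi\nu_{rs}t}.$$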


Before substituting these expressions in (446) we must replace $x$ by $x'$
(coordinates of the point $\mathbf r'$) and $t$ by $t' = t - R/c$. Now so long as $R$ is
very large compared with the atomic dimensions we can put

$$R = R_0 - \mathbf n\cdot\mathbf r',$$

where $R_0$ is the distance of the point $\mathbf r$ from the centre (nucleus) of the
atom and $\mathbf n = \mathbf R_0/R_0$ the unit vector pointing in the corresponding
direction. We thus have

$$\rho(\mathbf r', t') = \sum_r\sum_s a_r^\dagger a_s\,\psi_r^{0*}(x')\,\psi_s^{0}(x')\,e^{i2\pi\nu_{rs}(t - R_0/c + \mathbf n\cdot\mathbf r'/c)}$$

and a similar expression for $\mathbf j(\mathbf r', t')$.

† More exactly, the operators whose characteristic values are the probable values of
the respective densities.
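The far-field replacement of $R$ by $R_0 - \mathbf n\cdot\mathbf r'$ can be checked numerically. The following is only a minimal sketch: the helper names and the sample numbers (an observation point about $10^6$ atomic radii away) are illustrative assumptions, not taken from the text.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def exact_distance(R0_vec, r_prime):
    """Exact R = |r - r'|, with r = R0_vec measured from the nucleus."""
    return norm([a - b for a, b in zip(R0_vec, r_prime)])

def far_field_distance(R0_vec, r_prime):
    """Approximation R ~ R0 - n.r', n being the unit vector along R0_vec."""
    R0 = norm(R0_vec)
    n = [c / R0 for c in R0_vec]
    return R0 - sum(a * b for a, b in zip(n, r_prime))

# Hypothetical sample values, in units of the atomic dimension.
R0_vec = [6.0e5, 0.0, 8.0e5]    # observation point, |R0| = 1e6
r_prime = [0.5, -0.3, 0.8]      # point inside the electron cloud

error = abs(exact_distance(R0_vec, r_prime) - far_field_distance(R0_vec, r_prime))
print(error)  # of the order |r'|^2 / (2 R0)
```

The neglected term is of the order $|\mathbf r'|^2/2R_0$, which is why the approximation is harmless in the phase only as long as this quantity is small compared with the wave-length.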

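Replacing $R$ in the denominator of the integrands in (446) by $R_0$
(which is permissible so long as $R_0$ is supposed to be sufficiently large)
we obtain the following expressions for the electromagnetic potentials,

$$\phi(\mathbf r,t) = \frac{\varepsilon}{R_0}\sum_r\sum_s a_r^\dagger a_s\,e^{i2\pi\nu_{rs}(t-R_0/c)}\,f_{rs}, \qquad
\mathbf A(\mathbf r,t) = \frac{\varepsilon}{R_0}\sum_r\sum_s a_r^\dagger a_s\,e^{i2\pi\nu_{rs}(t-R_0/c)}\,\mathbf g_{rs}, \tag{447}$$

where

$$f_{rs} = \int \psi_r^{0*}\psi_s^{0}\,e^{i2\pi\nu_{rs}\mathbf n\cdot\mathbf r'/c}\,dV', \qquad
\mathbf g_{rs} = \int \psi_r^{0*}\gamma\,\psi_s^{0}\,e^{i2\pi\nu_{rs}\mathbf n\cdot\mathbf r'/c}\,dV'. \tag{447 a}$$

The electric and magnetic field strengths can be calculated from (447) with
the help of the classical equations

$$\mathbf E = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf A}{\partial t}, \qquad \mathbf H = \operatorname{curl}\mathbf A, \tag{448}$$

which give (if $R_0$ in the denominator of (447) is treated as a constant)

$$\mathbf E = \frac{i\varepsilon}{cR_0}\sum_r\sum_s a_r^\dagger a_s\,e^{i2\pi\nu_{rs}(t-R_0/c)}\,2\pi\nu_{rs}\,(\mathbf n f_{rs} - \mathbf g_{rs}),$$
$$\mathbf H = -\frac{i\varepsilon}{cR_0}\sum_r\sum_s a_r^\dagger a_s\,e^{i2\pi\nu_{rs}(t-R_0/c)}\,2\pi\nu_{rs}\,(\mathbf n\times\mathbf g_{rs}). \tag{448 a}$$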

These expressions are easily seen to satisfy the relations

$$\mathbf H = \mathbf n\times\mathbf E, \qquad \mathbf E = -\mathbf n\times\mathbf H$$

characteristic of the classical radiation field. Indeed the only
non-classical feature of the preceding equations besides the quantum
frequencies $\nu_{rs}$ (which appear just as well in the old Schrödinger theory)
is the non-commutative character of the coefficients $a_r$. This feature
becomes manifest, however, only when we pass to the calculation of
the electromagnetic energy.
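The two relations characteristic of the radiation field can be verified for a single harmonic term. The sketch below is only illustrative: it takes the amplitudes as $\mathbf E \propto \mathbf n f - \mathbf g$ and $\mathbf H \propto -\mathbf n\times\mathbf g$ with a common prefactor dropped, uses real sample vectors instead of complex amplitudes, and assumes $f = \mathbf n\cdot\mathbf g$ (the condition that conservation of charge would impose on the integrals $f_{rs}$, $\mathbf g_{rs}$; this step is not spelled out in the text).

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Hypothetical sample values: n is a unit vector, g an arbitrary amplitude.
n = [2.0 / 3.0, 1.0 / 3.0, 2.0 / 3.0]
g = [0.4, -1.2, 0.7]
f = dot(n, g)                      # assumed consequence of charge conservation

E = [n[i] * f - g[i] for i in range(3)]    # E ~ n f - g
H = [-c for c in cross(n, g)]              # H ~ -(n x g)

n_cross_E = cross(n, E)
minus_n_cross_H = [-c for c in cross(n, H)]

print(max(abs(H[i] - n_cross_E[i]) for i in range(3)))
print(max(abs(E[i] - minus_n_cross_H[i]) for i in range(3)))
```

The first relation holds for any $f$; the second holds only because $\mathbf E$ is transverse, which is where the assumed condition $f = \mathbf n\cdot\mathbf g$ enters.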


51. Connexion between Quantized Mechanical (Electron) Waves 
and Electromagnetic Waves 

As we have already pointed out, in order to obtain a correct expression
for the energy (as well as for the other quadratic quantities) we
must split up the linear parameters of the electromagnetic field $\phi$, $\mathbf A$,
$\mathbf E$, $\mathbf H$ into two parts: $\phi^-$, $\mathbf A^-$, $\mathbf E^-$, $\mathbf H^-$ and $\phi^+$, $\mathbf A^+$, $\mathbf E^+$, $\mathbf H^+$, corresponding
to terms with negative and positive frequencies respectively. The
energy density is then represented by the operator

$$\frac{1}{4\pi}\,(\mathbf E^+\cdot\mathbf E^- + \mathbf H^+\cdot\mathbf H^-). \tag{449}$$

In a similar way the energy stream (Poynting’s vector) is represented
by the operator

$$\frac{c}{4\pi}\,(\mathbf E^+\times\mathbf H^- - \mathbf H^+\times\mathbf E^-). \tag{449 a}$$


The negative and positive frequency terms of $\phi$, etc., should not be
identified with the operators $\phi$ and $\phi^\dagger$ which have been introduced in
§ 49 with the help of the operators $b$, $b^\dagger$ of the Einstein-Bose statistics.
In fact the electromagnetic waves we are now considering are not plane
waves but spreading spherical waves, with amplitudes which vary as
the reciprocal distance from the emitting atom and decrease
exponentially with the time, the vibration (r, s) being in fact damped
according to the law $e^{-2\pi(\Gamma_r+\Gamma_s)t}$. These damped spherical waves are,
moreover, quantized in a way different from the plane waves considered
before, namely, through the operators $a_r^\dagger a_s$ and $a_s^\dagger a_r$, instead of the
operators $b^\dagger$ and $b$ of the previous theory.

It is interesting, however, to note that the operators of these two
types are to some extent very similar. If $r > s$ (i.e. $W_r > W_s$), then
$a_r^\dagger a_s$ obviously corresponds to $b^\dagger$, and $a_s^\dagger a_r$ to $b$ (in the sense that the
former relate to harmonic terms with positive frequencies and the
latter to terms with negative frequencies). Putting accordingly

$$a_r^\dagger a_s = b_{rs}^\dagger \quad\text{and}\quad a_s^\dagger a_r = b_{rs},$$

we get

$$b_{rs}b_{rs}^\dagger = a_s^\dagger a_r a_r^\dagger a_s = a_s^\dagger a_s\,a_r a_r^\dagger = n_s(1-n_r),$$
$$b_{rs}^\dagger b_{rs} = a_r^\dagger a_s a_s^\dagger a_r = a_r^\dagger a_r\,a_s a_s^\dagger = n_r(1-n_s),$$

and consequently

$$b_{rs}b_{rs}^\dagger - b_{rs}^\dagger b_{rs} = n_s - n_r.$$

In the case of an emission due to the transition r → s the characteristic
values of $n_r$ and $n_s$ after the transition are $n_r = 0$ and $n_s = 1$, so that
the preceding expression reduces to 1, just like $b\,b^\dagger - b^\dagger b$. In a similar
way it can be shown that the operators $a_{r'}^\dagger a_{s'} = b_{r's'}^\dagger$ and $a_s^\dagger a_r = b_{rs}$
commute with each other unless $r' = r$ or $s' = s$ (if $r' = r$, then
$b_{rs}b_{rs'}^\dagger - b_{rs'}^\dagger b_{rs} = a_s^\dagger a_{s'}$), while $b_{rs}^\dagger$ always commutes with $b_{r's'}^\dagger$ and $b_{rs}$
with $b_{r's'}$.
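The relation $b_{rs}b_{rs}^\dagger - b_{rs}^\dagger b_{rs} = n_s - n_r$ can be checked with explicit matrices. The sketch below is an illustration only: it assumes a Jordan-Wigner representation of two fermionic modes r and s (a standard construction which the text itself does not use) and small hand-written matrix helpers.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def dagger(A):
    # All matrices here are real, so the adjoint is the transpose.
    return [list(col) for col in zip(*A)]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]          # sign factor of the Jordan-Wigner construction
a = [[0, 1], [0, 0]]           # single-mode annihilation: |1> -> |0>

# Basis ordering |n_r, n_s>: index 0 = |0,0>, 1 = |0,1>, 2 = |1,0>, 3 = |1,1>.
a_r = kron(a, I2)
a_s = kron(Z, a)
I4 = kron(I2, I2)

n_r = matmul(dagger(a_r), a_r)
n_s = matmul(dagger(a_s), a_s)

b_dag = matmul(dagger(a_r), a_s)   # a_r^+ a_s
b = matmul(dagger(a_s), a_r)       # a_s^+ a_r

anticomm_rr = add(matmul(a_r, dagger(a_r)), matmul(dagger(a_r), a_r))
anticomm_rs = add(matmul(a_r, dagger(a_s)), matmul(dagger(a_s), a_r))
commutator = sub(matmul(b, b_dag), matmul(b_dag, b))

print(commutator == sub(n_s, n_r))
```

In the state with $n_r = 0$, $n_s = 1$ (index 1 in the ordering chosen above) the diagonal element of the commutator is 1, in agreement with the remark about the emission r → s.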

These results seem to indicate that it is neither necessary nor possible
to build up a theory of quantized electromagnetic waves in empty space
on the basis of the very restricted analogy between these waves and
the quantized waves representing the motion of ordinary particles which
conform to the statistics of Einstein-Bose. The true relationship
between the electromagnetic waves and the quantized electron waves in
three-dimensional space is probably much more adequately represented
by the fact that the amplitudes of the former are quadratic in the
amplitudes of the latter, the ‘symmetrical’ operators $b$ being thus
replaced by quadratic combinations of the ‘antisymmetrical’ operators $a$.

The theory of quantized electromagnetic waves developed in § 49 
must therefore be regarded as a convenient though artificial method 
for dealing with radiation problems involving ‘spontaneous’ transitions,
rather than the true picture of a physical reality. As a matter
of fact, this method implies that the radiation emitted by an atom 
which is situated in a rectangular enclosure with reflecting walls 
is converted into plane standing waves, which represent the normal 
modes of electromagnetic vibrations consistent with the correspond- 
ing boundary conditions. Under such circumstances it is not neces- 
sary to consider the damped spherical electromagnetic waves which 
are emitted during the transition of the atom from one state to 
another, this transition along with the resulting change in the radiation 
field being described as a transition of the complete system: atom + 
radiation (in the form of normal vibrations) from one stationary state 
to another. It should be noted that this is exactly the same type of 
description as that used in the perturbation theory of ordinary transitions
not involving any radiation effects: the transition is not investigated
as a process with a definite course in time, it being simply assumed
that this process brings the system from one unperturbed state to 
another. 

If we wished to consider the ‘spontaneous’ transition of the atom 
from a higher to a lower state as the result of its own radiation field, 
described by spherical waves, we should use a more complicated
perturbation method, involving damped vibrations, the transition
appearing not as an instantaneous jump with a certain probability per unit
time, but as a continuous process starting at $t = 0$ and ending at $t = \infty$,
with an effective duration of the order of $1/\Gamma$.

It should be mentioned further that from this point of view (which 
seems to be the really correct one) the electromagnetic radiation ought 
to be considered always in conjunction with the matter by which it 
is emitted, absorbed, or scattered. In fact the radiation enclosed in
an empty vessel with perfectly reflecting walls and considered as an
independent dynamical system is merely a fiction, since its reflection
by the walls is actually due to the absorption and re-emission, or to
the scattering, by the atoms constituting these walls. The absorption
of radiation which, according to the method of quantized electro- 
magnetic waves in an enclosure, is simply a transition of the absorbing 
atom from a state of lower energy to that of a higher energy with the 
accompanying decrease of the energy of the corresponding electro- 
magnetic wave system by just one quantum, must be considered as the 
result of the superposition on the primary radiation, causing the transi- 
tion, of the secondary radiation emitted by the atom. This is the 
picture of the absorption process which is given by classical electro- 
dynamics, and it must remain fundamentally unchanged in a consistent 
quantum theory, where actual processes must only be replaced by 
probable ones. 

The current idea that the emission of radiation can be due only to 
a transition of the atom from a higher to a lower state is fundamentally 
wrong; the converse transition is just as well accompanied by emission 
of radiation, which, however, cuts down the primary radiation causing 
the transition, and is therefore manifested as the decrease—i.e. absorp- 
tion—of the latter. 

In the preceding discussion of the connexion between the quantized
mechanical (electron) waves and the electromagnetic waves, the former
were dealt with as the cause of the latter. This relation can, however,
be reversed in the sense that the motion of the electrons is influenced
by electromagnetic waves of external origin. This influence has been
actually examined already by the method of the perturbation theory
in the preceding section (in connexion with the scattering) and especially
in § 49. It remains to be seen whether the two types of quantization,
assumed for the two kinds of waves, are consistent with each other in 
this respect. 

The expressions obtained in § 50 by the perturbation theory for the
amplitudes $a_r$, which were supposed to have initially a characteristic
value zero, must obviously satisfy the general commutation relations
$a_r^\dagger a_s + a_s a_r^\dagger = \delta_{rs}$, etc. Assuming for the sake of simplicity that all the
coefficients $a^0$ but one vanish, we get, preserving the order of all the
non-trivial factors involved,


\Frgl? artbtb Heal 4 tao 


+ a number of harmonically oscillating terms which we shall leave 


aside, since their average value vanishes. 


ala, = mle 


Now the products $b^\dagger b$ and $b\,b^\dagger$, whether $b$ and $b^\dagger$ are defined as the
amplitude-operators of the Bose-Einstein statistics or as the products
of the type $a_p^\dagger a_q$ and $a_q^\dagger a_p$ (with suitably chosen values of p and q)
commute with $a^0$ and $a^{0\dagger}$. We thus get
t t 


and in a similar way 


1 bbt bth 
te gdiqot 2.1 (3 om N 
a,a, ag 4 rid ol lao orm 


We see from these equations that the relation $a_r^\dagger a_r + a_r a_r^\dagger = 1$ will
follow from the relation $a^{0\dagger}a^0 + a^0 a^{0\dagger} = 1$ only if it is assumed that
$b\,b^\dagger = b^\dagger b$, that is, if $b$ and $b^\dagger$ are treated as ordinary (commutable)
numbers. As to the relations $a_r^\dagger a_s + a_s a_r^\dagger = 0$, etc., they are easily seen
to hold (if $r \neq s$); in fact, so long as oscillating terms are dropped, we
get separately $a_r^\dagger a_s = a_s a_r^\dagger = 0$.


52. The Quantum Electrodynamics of Heisenberg, Pauli, and Dirac
The absence of complete harmony between the mechanical and the 
electromagnetic waves from the point of view of their quantization is a 
very unsatisfactory feature of the preceding theory. It can be shown, 
however, to be due, at least to some extent, to the approximate form in 
which this theory has been developed hitherto. We shall now briefly 
consider its more exact formulation due to Heisenberg and Pauli. This 
formulation is at the same time a generalization, which treats the 
radiation field as but a special case of the electromagnetic field, pro- 
duced by matter and acting upon it, and includes ordinary electric and 
magnetic forces, treating them in the same way as radiation effects. 

The theory of Heisenberg and Pauli can be condensed into the 
following equations: 

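1. The equation of motion

$$\left[p_t + \varepsilon\phi + c\gamma\cdot\left(\mathbf p - \frac{\varepsilon}{c}\mathbf A\right) + \gamma_0 m_0 c^2\right]\psi = 0, \tag{450}$$

where $\psi$ is Dirac’s one-column matrix with the four components
$\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$.

2. The equations of the electromagnetic field

$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\phi = -4\pi\varepsilon\,\psi^\dagger\psi, \qquad
\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\mathbf A = -4\pi\varepsilon\,\psi^\dagger\gamma\,\psi, \tag{451}$$

with the usual relations

$$\mathbf E = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf A}{\partial t}, \qquad \mathbf H = \operatorname{curl}\mathbf A \tag{451 a}$$

between the potentials $\phi$, $\mathbf A$ and the field strengths $\mathbf E$, $\mathbf H$.

3. The commutability equations expressing the quantization of the
mechanical field according to the Pauli-Fermi statistics:

$$\psi(x)\psi^\dagger(x') + \psi^\dagger(x')\psi(x) = \delta(x-x'),$$
$$\psi(x)\psi(x') + \psi(x')\psi(x) = 0, \tag{452}$$
$$\psi^\dagger(x)\psi^\dagger(x') + \psi^\dagger(x')\psi^\dagger(x) = 0.$$

4. The commutability equations for the electromagnetic field in
empty space (i.e. in the absence of matter, see below):

$$E_i(x)A_j(x') - A_j(x')E_i(x) = -\frac{2hc}{i}\,\delta_{ij}\,\delta(x-x'), \tag{453}$$
$$A_i(x)A_j(x') - A_j(x')A_i(x) = 0, \qquad E_i(x)E_j(x') - E_j(x')E_i(x) = 0. \tag{453 a}$$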

The equations (450) and (451) along with the quantum conditions 
(452) can be considered as a generalization of the equations (410) and 
(410 b) which have been established in § 46 as the exact equivalent of the 
Schrédinger theory of a system of electrons described by unquantized 
ψ-waves in the configuration space, and acting on each other according
to Coulomb’s law. This generalization consists in the introduction of 
the finite velocity of propagation c of electromagnetic actions, both in 
an indirect way—by substituting the relativistic equation of motion 
(450) for the non-relativistic one (410), and in a direct way—by sub- 
stituting the equations (451) expressing the law of the retarded action 
for the Poisson equation (410b). 

The differential equations (451) can be replaced by the explicit 
expressions for the ‘retarded’ potentials 


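$$\phi(\mathbf r,t) = \varepsilon\int\frac{\psi^\dagger\psi(\mathbf r',t')}{|\mathbf r-\mathbf r'|}\,dV' + \phi^0(\mathbf r,t), \qquad
\mathbf A(\mathbf r,t) = \varepsilon\int\frac{\psi^\dagger\gamma\,\psi(\mathbf r',t')}{|\mathbf r-\mathbf r'|}\,dV' + \mathbf A^0(\mathbf r,t), \tag{454}$$

where $t' = t - |\mathbf r-\mathbf r'|/c$; $\phi^0$ and $\mathbf A^0$ are arbitrary solutions of the
homogeneous d’Alembert equations

$$\nabla^2\phi^0 - \frac{1}{c^2}\frac{\partial^2\phi^0}{\partial t^2} = 0, \qquad
\nabla^2\mathbf A^0 - \frac{1}{c^2}\frac{\partial^2\mathbf A^0}{\partial t^2} = 0, \tag{454 a}$$

satisfying the relation

$$\operatorname{div}\mathbf A^0 + \frac{1}{c}\frac{\partial\phi^0}{\partial t} = 0. \tag{454 b}$$

If we put $\phi^0 = 0$, $\mathbf A^0 = 0$, that is, confine ourselves to the retarded
potentials produced by the motion of the electrons which is described
by the operator-function $\psi$, the action of an electron on itself which
may seem to follow from these equations is actually eliminated
automatically owing to the commutation relations (452). The equations
(452), (450), and (454) (with $\phi^0 = 0$, $\mathbf A^0 = 0$) must thus give the adequate
description of the mutual action of the electrons allowing for the
relativity and retardation effects.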

The weak point of the Heisenberg-Pauli theory consists, as it seems, 
in the introduction of additional quantization rules for the
electromagnetic field expressed by the equations (453). These equations do
not follow from the equations (451) in conjunction with (452), but are 
postulated on the basis of the analogy between the light waves and the 
mechanical waves which describe the motion of particles conforming to 
the Einstein-Bose statistics. In order to obtain the commutability 
relations for the electromagnetic field, Heisenberg and Pauli (following
an earlier paper by Pauli and Jordan) actually come back to the old 
mechanical theory of light, considered as vibrations of an elastic ether, 
and give the quantum-mechanical theory of these vibrations, based on 
the classical wave equations (454a). It is indeed possible to write down 
the latter in a form corresponding to the ordinary Hamiltonian equa- 
tions of motion of a system of material points for the limiting case when 
these points constitute a continuous medium. Replacing the classical 
Hamiltonian equations of the motion of such a continuous medium by 
the corresponding matrix or wave-mechanical equations, one obtains 
the equations for the quantized elastic or electromagnetic waves. The 
photons corresponding to these waves are thus introduced in exactly 
the same way as the phonons, corresponding to ordinary sound waves 
(Part I). The energy of electromagnetic (or ‘elastic’) oscillations of a 
given frequency $\nu$ is thus quantized according to the usual formula
$(n+\tfrac12)h\nu$ for the ordinary harmonic oscillator. In order to get rid of the
$\tfrac12$ it is necessary to modify the definition of the energy in the way
shown in § 49 and § 50. 

It should be remembered that the above theory refers to the ‘free 
ether’, i.e. to empty space, without electric charges. This corresponds 
to the electromagnetic field which has been denoted above as ¢°, A®. 
Now such a field can be described, as is well known, without loss of
generality by putting $\phi^0 = 0$. Treating the components of the vector $\mathbf A^0$
as the coordinates of the particles of an elastic ether described by the
Lagrangian function $L = \frac{1}{8\pi}\int (E^2 - H^2)\,dV$, one can define the electric
field $\mathbf E^0 = -\frac{1}{c}\frac{\partial\mathbf A^0}{\partial t}$ as the quantity corresponding to the mechanical
momentum of these particles. Hence we obtain the commutation
relations (453), (453 a) which are merely the ordinary commutation
relations

$$p_{kn}\,q_{ln'} - q_{ln'}\,p_{kn} = \frac{h}{2\pi i}\,\delta_{kl}\,\delta_{nn'} \qquad (k, l = 1, 2, 3),$$

etc., for a system of particles 1, 2,..., n,..., n’,... in the limiting case when
these particles form a continuum.
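The passage from the Lagrangian to the relation (453) can be sketched as follows. This is a reconstruction under stated assumptions rather than the original derivation: Gaussian units with the Lagrangian density $(E^2 - H^2)/8\pi$ are assumed, $\Pi_i$ is an auxiliary symbol introduced here for the momentum conjugate to $A_i$, and the continuum limit replaces $\delta_{nn'}$ by $\delta(x-x')$.

```latex
\begin{aligned}
L &= \frac{1}{8\pi}\int (E^2 - H^2)\,dV, \qquad
\mathbf E = -\frac{1}{c}\,\dot{\mathbf A}, \qquad \mathbf H = \operatorname{curl}\mathbf A,\\[4pt]
\Pi_i(x) &= \frac{\delta L}{\delta \dot A_i(x)}
          = \frac{1}{4\pi c^2}\,\dot A_i(x)
          = -\frac{1}{4\pi c}\,E_i(x),\\[4pt]
\Pi_i(x)A_j(x') - A_j(x')\Pi_i(x) &= \frac{h}{2\pi i}\,\delta_{ij}\,\delta(x-x'),\\[4pt]
E_i(x)A_j(x') - A_j(x')E_i(x) &= -4\pi c\cdot\frac{h}{2\pi i}\,\delta_{ij}\,\delta(x-x')
 = -\frac{2hc}{i}\,\delta_{ij}\,\delta(x-x').
\end{aligned}
```

With a different choice of units the numerical factor on the right-hand side would change accordingly, the structure of the relation remaining the same.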

It should be mentioned that this field can be represented as a superposition
of plane harmonic waves, as has already been done in § 49. The
commutation relations (453), (453 a) can be replaced accordingly by the 
relations 


Al(k)A,(k’)—A,(k’)A}(k) = —% 8,,,3(k—K’), (456) 
to which we must add the relation 
o'eyp()— ge 6) = + 2 5-1), (455) 


all other combinations being mutually commutable. These relations 
can be derived directly from the relations of § 49 for the operators
$b^\dagger$, $b$ representing the amplitudes of the harmonic terms with positive
and negative frequencies respectively for the limiting case of an en- 
closure with an infinite volume. 

In order to preserve the above commutation relations for the
electromagnetic field in the presence of electric charges (electrons) it is
necessary to modify Maxwell’s equations by the addition of small terms
proportional to the expression $P = \operatorname{div}\mathbf A + \frac{1}{c}\frac{\partial\phi}{\partial t}$ and to its derivatives,
replacing the condition (454 b) by the additional commutation relation
U U he U 

[Plz)b(2')—b@' )P(a)] = 8(r—r’), (455 b) 


where ¢ is the above-mentioned proportionality coefficient which in the 
final result is set equal to zero. 

It has been recently shown by Dirac that it is possible to give a 
somewhat different (relativistically invariant) formulation of the 
Heisenberg-Pauli theory for a system consisting of a given number of 
electrons or indeed of electrified particles of any kind. In Dirac’s theory 
the particles are described by the method of the configuration space,
and their mutual action is defined implicitly through their coupling
with the quantized electromagnetic field in empty space in conjunction
with a certain restrictive condition imposed on the wave function.

Let $\psi(x_1, t_1;\ x_2, t_2;\ \ldots;\ x_N, t_N)$ be the wave function of the particles
(electrons) each considered with its own individual time, and let further
$\phi(\mathbf r,t)$, $\mathbf A(\mathbf r,t)$ be the potentials of the quantized electromagnetic field,
satisfying the equations (454 a), (454 b) and the commutation conditions
(455). Dirac’s equations can then be written as follows:

$$\left(\frac{h}{2\pi i}\frac{\partial}{\partial t_k} + H_k\right)\psi = 0, \tag{456}$$

where

$$H_k = e_k\,\phi(x_k,t_k) + c\gamma_k\cdot\left[\mathbf p_k - \frac{e_k}{c}\mathbf A(x_k,t_k)\right] + \gamma_{k0}\,m_{k0}\,c^2 \tag{456 a}$$

is the Hamiltonian for the kth particle.

The function ψ must be actually treated as a matrix with respect to
the stationary states of the field taken alone. These states correspond 
to the different plane harmonic waves specified by the wave-number 
vector k and the polarization quantum number £. Associating these 
with photons, we can regard the above treatment as a particular case 
of the general method of treating incomplete systems, explained in 
Chap. VII, § 39, the ignored part (B) of the complete system being the 
‘photon gas’. 

It could be argued that it must be possible in this way to give an 
adequate description of the mutual action between the particles, inas- 
much as their mutual action with the photons [ignored in the equations 
(454 a)] is represented by the energy operators 

$$M_k = e_k\left[\phi(x_k,t_k) - \gamma_k\cdot\mathbf A(x_k,t_k)\right] \tag{456 b}$$

(the operator $c\gamma_k\cdot\mathbf p_k + \gamma_{k0}\,m_{k0}\,c^2$ corresponding to the energy of the kth
particle taken alone).

This is, however, not so, for the relation between matter and field is 
expressed not only by this operator M, describing the effect of the 
latter on the former, but also by the terms $\varepsilon\psi^\dagger\psi$ and $\varepsilon\psi^\dagger\gamma\psi$ on the right
side of the equations (451) which describe the effect of the matter on the 
field. It is obviously impossible to get rid of this side of their mutual 
relationship, and it must be introduced somehow, explicitly or implicitly, 
into the preceding theory in order to transform it into a theory not only
of the motion but also of the mutual action between all the particles 
concerned. This is done by Dirac in the following manner: 

Let us come back to the complete system: electrons + photons
(electromagnetic field), and let us consider $\psi$ as a function both of the
$x_k$, $t_k$ of the former, and of the $x$, $t$ of the latter, it being understood
that the system is doubly quantized with respect to the photons [which
corresponds to the commutation relations (455)]. The equations (454 a),
(454 b) will then be rewritten in the form

$$\left[\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2}\right]\psi = 0, \qquad
\left[\nabla^2\mathbf A - \frac{1}{c^2}\frac{\partial^2\mathbf A}{\partial t^2}\right]\psi = 0, \tag{457}$$

$$\left[\operatorname{div}\mathbf A + \frac{1}{c}\frac{\partial\phi}{\partial t}\right]\psi = 0, \tag{457 a}$$

$\phi$ and $\mathbf A$ being defined as certain operators acting on $\psi$. The latter
equation can be considered as a constraint to which the function $\psi$ is subject.

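Now in order to describe the influence of matter on the electromagnetic
field this equation must be replaced by the following generalized
equation:

$$\left[\operatorname{div}\mathbf A + \frac{1}{c}\frac{\partial\phi}{\partial t}\right]\psi = \left[\sum_{k=1}^{N} e_k\,\Delta(X - X_k)\right]\psi, \tag{458}$$

where $X_k$ stands for $x_k, y_k, z_k, t_k$, and $\Delta(X)$ is the so-called ‘invariant
delta-function’ (introduced by Jordan and Pauli)

$$\Delta(X) = \frac{1}{r}\left[\delta(r+ct) - \delta(r-ct)\right] \tag{458 a}$$

(it represents a spherical wave concentrated in an infinitely thin layer
and travelling with the velocity of light from infinity so as to converge
at the point $r = 0$ at $t = 0$ and then diverging again to infinity). Using
the relations $\mathbf E = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf A}{\partial t}$, $\mathbf H = \operatorname{curl}\mathbf A$, one obtains accordingly,
besides the equations

$$\operatorname{curl}\mathbf E + \frac{1}{c}\frac{\partial\mathbf H}{\partial t} = 0, \qquad \operatorname{div}\mathbf H = 0,$$

which can be considered as identities, the equations

$$\left(\operatorname{curl}\mathbf H - \frac{1}{c}\frac{\partial\mathbf E}{\partial t}\right)\psi = \left[\nabla\sum_{k=1}^{N} e_k\,\Delta(X - X_k)\right]\psi,$$
$$(\operatorname{div}\mathbf E)\,\psi = -\frac{1}{c}\left[\sum_{k=1}^{N} e_k\,\frac{\partial}{\partial t}\Delta(X - X_k)\right]\psi. \tag{458 b}$$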
Let us now put $t_1 = t_2 = \ldots = t_N = t = T$, i.e. introduce a common
time for all the particles and for the field, and denote the corresponding
complete derivative for any quantity $f$ by $df/dT$, so that

$$\frac{d}{dT}f(t, t_1, t_2, \ldots, t_N) = \left[\frac{\partial f}{\partial t} + \sum_{k=1}^{N}\frac{\partial f}{\partial t_k}\right]_{t = t_k = T}.$$

Then remembering the relations 


0A 0A a _ ad te 
ao atop Md y= Bed) 


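and with the help of the formula

$$\frac{1}{c}\left(\frac{\partial\Delta}{\partial t}\right)_{t=0} = -4\pi\,\delta(\mathbf r),$$

we easily get, along with the trivial expressions

$$\mathbf E = -\nabla\phi - \frac{1}{c}\frac{d\mathbf A}{dT}, \qquad \mathbf H = \operatorname{curl}\mathbf A,$$

the equations

$$\left(\operatorname{div}\mathbf A + \frac{1}{c}\frac{d\phi}{dT}\right)\psi = 0, \tag{459}$$

and

$$\left(\operatorname{curl}\mathbf H - \frac{1}{c}\frac{d\mathbf E}{dT}\right)\psi = 4\pi\left[\sum_{k=1}^{N} e_k\,\gamma_k\,\delta(\mathbf r-\mathbf r_k)\right]\psi,$$
$$(\operatorname{div}\mathbf E)\,\psi = 4\pi\left[\sum_{k=1}^{N} e_k\,\delta(\mathbf r-\mathbf r_k)\right]\psi, \tag{459 a}$$

which are equivalent to

$$\left(\nabla^2\phi - \frac{1}{c^2}\frac{d^2\phi}{dT^2}\right)\psi = -\left[4\pi\sum_{k=1}^{N} e_k\,\delta(\mathbf r-\mathbf r_k)\right]\psi,$$
$$\left(\nabla^2\mathbf A - \frac{1}{c^2}\frac{d^2\mathbf A}{dT^2}\right)\psi = -\left[4\pi\sum_{k=1}^{N} e_k\,\gamma_k\,\delta(\mathbf r-\mathbf r_k)\right]\psi. \tag{459 b}$$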

In the limit $c = \infty$ these equations, together with the equations of
motion (456), reduce to the ordinary Schrödinger equation for the
N particles in the configuration space with the mutual potential energy
$U = \sum_{k<l} e_k e_l / r_{kl}$ corresponding to the Coulomb forces.
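The formula for the time derivative of the invariant delta-function at $t = 0$, which is used in passing from the many-time equations to the common time, can be checked directly from the definition (458 a). The following is a sketch only; $F(r)$ denotes an arbitrary smooth, spherically averaged test function introduced here for the purpose, continued as an even function to negative $r$.

```latex
\begin{aligned}
\Delta(X) &= \frac{1}{r}\left[\delta(r+ct) - \delta(r-ct)\right],\\[4pt]
\frac{1}{c}\frac{\partial\Delta}{\partial t}
  &= \frac{1}{r}\left[\delta'(r+ct) + \delta'(r-ct)\right]
  \;\xrightarrow{\;t=0\;}\; \frac{2}{r}\,\delta'(r),\\[4pt]
\int \frac{2}{r}\,\delta'(r)\,F(r)\,4\pi r^2\,dr
  &= 8\pi\int_0^\infty r\,F(r)\,\delta'(r)\,dr
   = 4\pi\int_{-\infty}^{\infty} r\,F(r)\,\delta'(r)\,dr\\
  &= -4\pi\left[\frac{d}{dr}\bigl(r\,F(r)\bigr)\right]_{r=0}
   = -4\pi\,F(0),
\end{aligned}
```

so that the expression acts on any such test function exactly as $-4\pi\,\delta(\mathbf r)$ would. At equal times $\Delta$ itself vanishes, which is why the constraint (458) goes over into the homogeneous condition for a common time.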
53. Breit’s Formula. Concluding Remarks 

The theories of Heisenberg and Pauli and of Dirac have been 
hitherto in practice rather fruitless, that is, they have not led to any 
marked progress in the theory of the interaction of electrons. The only 
improvement over the simple interaction theory based on Coulomb’s 
law is represented by a formula originally derived by Breit from the 
general equations of Heisenberg and Pauli’s theory. Breit’s results 
amount to the following approximate expression for the mutual energy 


of two electrons

$$W = \frac{e^2}{r}\left[1 - \frac{1}{2}\,\gamma^{\mathrm I}\cdot\gamma^{\mathrm{II}} - \frac{1}{2r^2}\,(\gamma^{\mathrm I}\cdot\mathbf r)(\gamma^{\mathrm{II}}\cdot\mathbf r)\right], \tag{460}$$

where $c\gamma^{\mathrm I}$ and $c\gamma^{\mathrm{II}}$ are the respective velocity matrices of Dirac’s theory.
This expression takes account of the electromagnetic (spin-orbit) and
magnetic (spin-spin) interaction and also to some extent of the
retardation effects. It can be derived in a much simpler way without any use
of the Heisenberg-Pauli-Dirac electrodynamics. The simplest and most
straightforward of these derivations is the following one due to K.
Nikolsky.†

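The energy of an electron in an external electromagnetic field specified
by the potentials $\phi$, $\mathbf A$ is given by the formula

$$W = e\phi - e\gamma\cdot\mathbf A. \tag{461}$$

Let us imagine that these potentials are due to the retarded action of
a second electron moving classically. Their values at a given point and
instant $\tau$ can be expanded in Lagrange’s series in the form‡

$$\phi = \sum_{\mu=0}^{\infty}\frac{(-1)^\mu}{\mu!\,c^\mu}\,\frac{\partial^\mu}{\partial\tau^\mu}\left(e\,r^{\mu-1}\right), \qquad
\mathbf A = \sum_{\mu=0}^{\infty}\frac{(-1)^\mu}{\mu!\,c^\mu}\,\frac{\partial^\mu}{\partial\tau^\mu}\left(e\,r^{\mu-1}\,\frac{\mathbf v}{c}\right), \tag{462}$$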

where $\mathbf v$ is the velocity of the electron producing the field and $r$ its
distance from the other electron at the instant $\tau$. It is natural to think
that the quantum theory of the interaction can be obtained from the
classical one by replacing the classical time derivatives $\partial\phi/\partial\tau$ by the
quantum Poisson bracket expression $[H, \phi] = (2\pi i/h)(H\phi - \phi H)$, where
$H$ is the Hamiltonian of the system formed by the two electrons without
the interaction term $W$. The velocity vector $\mathbf v$ must naturally also be
replaced by the matrix vector $c\gamma$. We thus get

$$\frac{\partial^n\phi}{\partial\tau^n} = \left(\frac{2\pi i}{h}\right)^{n}\sum_{\nu=0}^{n}(-1)^\nu\,\frac{n!}{\nu!\,(n-\nu)!}\,H^{n-\nu}\,\phi\,H^{\nu}. \tag{463}$$
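Here $\tau$ corresponds to the common time $T$ of the whole system, that
is, of the two electrons (the electromagnetic field being no longer
considered as a dynamical system and playing an auxiliary role only). It
is natural to define the corresponding energy $H$ as the sum

$$H = H^{\mathrm I} + H^{\mathrm{II}}, \tag{464}$$

where $H^{\mathrm I}$ and $H^{\mathrm{II}}$ are the Hamiltonians of the two electrons taken
separately, i.e.

$$H^{\mathrm I} = c\gamma^{\mathrm I}\cdot\mathbf p^{\mathrm I} + \gamma_0^{\mathrm I}\,m_0 c^2, \qquad H^{\mathrm{II}} = c\gamma^{\mathrm{II}}\cdot\mathbf p^{\mathrm{II}} + \gamma_0^{\mathrm{II}}\,m_0 c^2. \tag{464 a}$$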
† Not yet published. The other derivations are due to Møller, Rosenfeld, and Scherzer.

‡ Cf. my Lehrbuch der Elektrodynamik, i, p. 184.

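Replacing each time derivative by a bracket with $H$ turns the $n$-th derivative into an $n$-fold commutator, and the binomial sum over $H^{n-\nu}\phi H^{\nu}$ is simply that nested commutator written out. The sketch below checks this algebraic identity with small integer matrices; the factor $(2\pi i/h)^n$ is left out, and the matrices `H` and `F` are arbitrary illustrative choices with no physical meaning.

```python
from math import comb

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mpow(A, k):
    out = identity(len(A))
    for _ in range(k):
        out = matmul(out, A)
    return out

def nested_commutator(H, F, n):
    """[H, [H, ... [H, F] ... ]] with n brackets, where [H, F] = HF - FH."""
    out = F
    for _ in range(n):
        out = sub(matmul(H, out), matmul(out, H))
    return out

def binomial_sum(H, F, n):
    """Sum over nu of (-1)^nu * C(n, nu) * H^(n-nu) F H^nu."""
    total = [[0] * len(H) for _ in range(len(H))]
    for nu in range(n + 1):
        term = matmul(matmul(mpow(H, n - nu), F), mpow(H, nu))
        total = add(total, scale((-1) ** nu * comb(n, nu), term))
    return total

# Arbitrary integer test matrices (hypothetical, for illustration only).
H = [[1, 2, 0], [2, 0, 1], [0, 1, 3]]
F = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]

print(all(nested_commutator(H, F, n) == binomial_sum(H, F, n) for n in range(6)))
```

The identity holds because multiplication by $H$ from the left and from the right are commuting operations, so that the $n$-th power of their difference expands by the ordinary binomial theorem.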
It must not be supposed that this expression for $H$ omits completely
the mutual action of the electrons. In fact the operators $\mathbf p^{\mathrm I}$ and $\mathbf p^{\mathrm{II}}$
must be considered as representing the total momentum of the respective
electrons, including the ‘potential momentum’ due to its partner, i.e.
_ Mo e a Mo e? 
Pe aren ae PS oy Fee” 
(464 b) 
in agreement with the approximate theory of §38. This will become 
more apparent when we compare Breit’s formula with the result of the 
above theory. With the help of (461), (462), and (463) we get: 


= 2x)" ? 
W = Wru — 2s Cv [Hr Hl Fe v4 


+y'A’y"reHe-*}, (465) 


where H’ = > CA HA He-A. (465 a) 
A=0 

Dropping terms of the third and higher orders with respect to $v/c$
(i.e. $\gamma$), we have


] I. aq,Il 
Wear @(7— TY) _ re (rH? —HrH +H?) 


whence, according to (465a), 


2, 
Wen = of) TY) 208 pH — HH — WH + HT) 
together with terms which are proportional to the square of $H^{\mathrm I}$ or $H^{\mathrm{II}}$,
which we shall neglect as having no physical meaning (they represent
the action of a point-like electron on itself).
Now the expression in the brackets [ ] can be put in the form

$$H^{\mathrm{II}}(H^{\mathrm I}r - rH^{\mathrm I}) - (H^{\mathrm I}r - rH^{\mathrm I})H^{\mathrm{II}}.$$

Using the formulae (464 a) for $H^{\mathrm I}$ and $H^{\mathrm{II}}$ we get


and 


H"(H'r—rH")—(H'r—r)H™ = (=) [- (y" a r)], 


† These terms are physically irrelevant also for another reason, namely, because the
squares of the matrices $\gamma^{\mathrm I}$ and $\gamma^{\mathrm{II}}$ are equal to 1 (or rather to 3), whereas they must
represent small quantities of the second order with respect to $v^{\mathrm I}/c$ and $v^{\mathrm{II}}/c$. This
difficulty has, however, an origin entirely different from the preceding one, being connected
with the existence of states of negative proper energy.


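which leads to Breit’s formula for $W$, this expression being actually
symmetrical with regard to the two electrons.

The classical expression for $W$ corresponding to Breit’s formula is

$$W = \frac{e^2}{r}\left[1 - \frac{1}{2c^2}\,\mathbf v^{\mathrm I}\cdot\mathbf v^{\mathrm{II}} - \frac{1}{2c^2r^2}\,(\mathbf r\cdot\mathbf v^{\mathrm I})(\mathbf r\cdot\mathbf v^{\mathrm{II}})\right]. \tag{466}$$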
The second term must obviously represent the effect of the
electromagnetic interaction between the two electrons with due account of the
retardation. Now in the non-relativistic theory of § 38 where this 
retardation was left out of account, the electromagnetic interaction was 
shown to correspond to a mutual kinetic energy 
2 oy. vi, (466 a) 
cr 
which is quite different from the second term of (466). 

This difference is, however, greatly attenuated if we consider the
total energy of the two electrons $H + W = H^{\mathrm I} + H^{\mathrm{II}} + W$, or more exactly
the classical expression which corresponds to it and which is obtained
if $c\gamma$ is replaced by $\mathbf v$, $\gamma_0$ by $\sqrt{1-(v/c)^2}$, and the $\mathbf p$’s by the expressions
(464 b). We thus get
i= a Re oe 3 iterate 


(v'/c) 


yi. 5; V1) + mech ff1—(o¥le) 


(eee —(wryejy™" 
Mo c2 4 mc? ies 
= Yi— crepe} ytd (otjeyyy 


and consequently 


HiW= 


— Vi-yl, 


Mo c? 
Fiona 


+— e |; vievil — 2 lead (466 b) 
2c? 

The first three terms in this expression represent the proper energy of 
the two electrons and their mutual potential energy, whereas the last 
one gives the energy of the electromagnetic interaction. Although still 
somewhat different from (466 a), it is, however, much more similar to it 
than the corresponding term of $W$. We obtain a still closer similarity
if we average over all the directions of the vector r, considering them 
as equally probable. We thus get 


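$$\overline{(\mathbf r\cdot\mathbf v^{\mathrm I})(\mathbf r\cdot\mathbf v^{\mathrm{II}})} = \tfrac{1}{3}\,r^2\,\mathbf v^{\mathrm I}\cdot\mathbf v^{\mathrm{II}},$$

which gives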
— I.qwil 

FW = et ice ts tsa 6686) 
The factor § appearing in the last (electromagnetic) term is the same 
as that which is met with in the calculation of the electron’s mass as 
due to the electromagnetic mutual action of the elements of its charge 
(supposed to be distributed in a spherically symmetrical way in a finite 
volume). 
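The averaging over the directions of $\mathbf r$ used above rests on the fact that the mean value of $(\mathbf r\cdot\mathbf v^{\mathrm I})(\mathbf r\cdot\mathbf v^{\mathrm{II}})$ over a sphere of radius $r$ is $\tfrac13 r^2\,\mathbf v^{\mathrm I}\cdot\mathbf v^{\mathrm{II}}$. A minimal numerical sketch follows; the function name, the midpoint quadrature and the sample velocities are illustrative assumptions, not part of the original argument.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sphere_average(v1, v2, r=1.0, n_theta=200, n_phi=400):
    """Average of (r.v1)(r.v2) over all directions of r (midpoint rule)."""
    total = 0.0
    weight = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * math.pi / n_theta
        w = math.sin(theta)                       # solid-angle weight
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            r_vec = [r * math.sin(theta) * math.cos(phi),
                     r * math.sin(theta) * math.sin(phi),
                     r * math.cos(theta)]
            total += w * dot(r_vec, v1) * dot(r_vec, v2)
            weight += w
    return total / weight

# Hypothetical sample velocities (in units of c) and separation r.
v1 = [0.3, -0.1, 0.2]
v2 = [0.05, 0.4, -0.25]
r = 2.0

numerical = sphere_average(v1, v2, r)
expected = r * r * dot(v1, v2) / 3.0
print(numerical, expected)
```

The two printed numbers should agree to within the accuracy of the quadrature, whatever the relative orientation of the two velocities.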

The above derivation of Breit’s formula is not free from objection, 
especially with regard to the definition (464) of the energy H. It could 
be slightly modified by adding W to the expression used before (this
would not alter the results to the approximation considered). The 
important point is that any symmetrization of the expression for H 
leads to cancelling terms of odd degree in the products of $\gamma^{\mathrm I}$ and $\gamma^{\mathrm{II}}$. The
same result is obtained if in the derivation of Lagrange’s series for the 
potentials ¢ and A we replace the retarded potentials by the mean 
value of the retarded and the accelerated ones. This symmetrization with 
respect to the time (which has been actually used for a similar purpose 
by Fokker) is equivalent to the symmetrization of the energy H with 
respect to the two electrons. This is natural since the time and the 
energy are dynamically conjugate quantities. 

We thus see incidentally that so long as we are using a symmetrical 
energy operator for two electrons, it is impossible to describe that part 
of their mutual action which is antisymmetrical in the two particles or 
in the time and which corresponds to the dissipation of energy by 
radiation. 

This reproach may not be applicable to the accurate form of the 
Heisenberg-Pauli-Dirac theory. This theory cannot be considered, 
however, as a satisfactory system of quantum electrodynamics for 
many other reasons. In the first place it is based on a fundamentally 
wrong interpretation of the relationship between matter (electrons)
and electromagnetic field (photons) as a formal analogy, the quantum 
theory of the electromagnetic field being developed accordingly as a 
wave-mechanical theory of the ‘ether’ in a somewhat disguised form 
adjusted to Maxwell’s equations. 

A second, more important, reason lies in the fact that material par- 
ticles are visualized as the primary things in Nature and are dealt with 
as unextended points with dynamical properties independent of those of 
the electromagnetic field, while the electromagnetic field is treated as 
but an auxiliary agent introduced for the description of their mutual 
action and serving to determine their motion. It seems, however, more 
reasonable to think that the electromagnetic field is the primary and 
fundamental thing in Nature, the material particles (electrons and pro- 
tons) being derivable from it, and possessing no independent mechanical 
properties. This point of view corresponds to the latest development of 
the classical electrodynamics, culminating in the electromagnetic theory 
of mass. The mechanical momentum and energy—potential and kinetic 
—must be interpreted from this point of view as the approximate form 
of electromagnetic momentum and energy, directly connected not with 
the particles but with the electromagnetic field. The laws of motion 
can be derived accordingly from the principle of conservation of electro- 
magnetic momentum and energy, applied to separate electrons, if the 
latter are considered not as points but as extended bodies (spheres) and 
if the external force acting on them is supposed to be balanced by the 
‘inner’ force, due to their own motion. 

This classical theory, which means the complete reduction of mechanics 
to electrodynamics, has met with one serious difficulty, connected with 
the problem of the spatial extension or ‘structure’ of the electron. It 
is responsible for the fact that the electromagnetic theory of mass, or, 
in other words, the electromagnetic derivation of mechanics, has 
remained without further development until now. The advent of the 
quantum theory did not in the least alter the situation, the modern 
wave or quantum mechanics being simply a modified form of the old 
mechanics of a point-like particle with a given mass. 

Now it seems quite certain that this new theory is in principle just 
as wrong as the old one, and that the next task in the development of 
our theory of the physical universe will consist in the application of the 
quantum ideas to the electromagnetic field in such a way as to obtain 
the mechanical laws as a corollary from the laws of conservation of 
electromagnetic energy and momentum. It is to be hoped that the 
main difficulty of the classical theory connected with the problem of 
the electron’s spatial extension will be eliminated by considering the 
electron as the product (and not the source) of the electromagnetic field, 
described in a consistent quantum way. One might, for example, define 
the electromagnetic field as a matrix from the point of view of the 
space-time manifold, i.e. as a matrix with the elements (x’|F|x”), where 
x is an abbreviation for x, y, z, t, the diagonal elements representing the 
probable values of the field at different points x’ = x”. The electron 
could be described accordingly with the help of a function D(|x’−x”|/a), 
similar to a Gaussian function, with a finite parameter a playing the 
role of the electron’s radius, |x’−x”| being the four-dimensional distance 
between the points x’ and x”. We are thus entitled to think that Dirac’s 
equation of motion will be replaced by an equation containing the 
electromagnetic momentum-energy tensor; the mass of the electron, 
instead of being introduced a priori as a parameter, being derivable 
from the quantum equivalent for its radius. A closer discussion of this 
question is, however, hardly possible at the present time. 


REFERENCES 
§ 4 


1. Approximate solution of Schrödinger’s equation based on the equation of 
Hamilton-Jacobi: 
G. WENTZEL, Zs. f. Phys. 38, 518 (1926). 
L. BRILLOUIN, C. R. 183, 24 (1926). 
H. A. KRAMERS, Zs. f. Phys. 39, 828 (1926). 
H. A. KRAMERS u. G. P. ITTMANN, Zs. f. Phys. 58, 217 (1929). 
2. The Virial Theorem: 
B. FINKELSTEIN, Zs. f. Phys. 50, 293 (1927). 
A. SOMMERFELD, Wellenmechanischer Ergänzungsband, Kap. II, § 9. 
3. The motion of a wave packet: 
P. EHRENFEST, Zs. f. Phys. 45, 455 (1927). 


§ 5 
1. Theory of canonical transformations and of conditionally periodic motion: 
M. BORN, Atommechanik, I, ch. ii. 
2. Connexion between classical and wave-mechanical average values: 
J. H. VAN VLECK, Proc. Nat. Ac. Sci. 14, 179 (1928). 


§ 7 
BORN, HEISENBERG, u. JORDAN, Zs. f. Phys. 35, 557 (1926). 
P. A. M. DIRAC, The Principles of Quantum Mechanics, ch. iii. 


§ 9 
E. SCHRÖDINGER, Ann. d. Phys. 79, 361 (1926). 
W. GORDON, Zs. f. Phys. 40, 117 (1926). 


§ 10 


On the normalization of wave functions in the case of continuous spectra: 
E. FUES, Ann. d. Phys. 81, 281 (1926). 
P. A. M. DIRAC, The Principles of Q. M., ch. iv. 


§ 11 
M. BORN, W. HEISENBERG, u. P. JORDAN, Zs. f. Phys. 35, 557 (1926). 
E. SCHRÖDINGER, Ann. d. Phys. 79, 734 (1926). 


§ 12 
W. HEISENBERG, Zs. f. Phys. 33, 879 (1925). 
N. BOHR, Über die Quantentheorie der Linienspektra, Braunschweig, 1923. 


§ 13 
M. BORN u. P. JORDAN, Elementare Quantenmechanik. 
P. A. M. DIRAC, The Principles of Q. M., §§ 29 and 30. 


CHAPTER IV (Transformation Theory) 


P. A. M. DIRAC, The Principles of Q. M., ch. v. 
M. BORN u. P. JORDAN, Elementare Quantenmechanik. 
J. v. NEUMANN, Mathematische Grundlagen der Quantenmechanik (1932); Gött. 
Nachr. 1927, p. 245. 


§ 19 
E. SCHRÖDINGER, Abhandlungen zur Wellenmechanik, III, S. 85. 
BORN, HEISENBERG, u. JORDAN, Zs. f. Phys. 35, 557 (1926). 
H. WEYL, Gruppentheorie und Quantenmechanik, § 16. 


§ 21 
P. A. M. DIRAC, Proc. Roy. Soc. 112, 661 (1926). 


§ 25 


W. GORDON, Zs. f. Phys. 40, 117 (1926). 
O. KLEIN, Zs. f. Phys. 37, 895 (1926). 


§ 28 
P. A. M. DIRAC, Proc. Roy. Soc. 117, 610 and 118, 351 (1928). 
C. G. DARWIN, Proc. Roy. Soc. 118, 654 (1928). 
J. FRENKEL, Zs. f. Phys. 52, 356 (1928). 


§§ 29, 30 
W. PAULI, Zs. f. Phys. 43, 601 (1927). 
J. FRENKEL, Lehrbuch der Elektrodynamik, i. 294 and 353. 
L. H. THOMAS, Phil. Mag., June, 1927, and Nature, 107, 514 (1926). 


§ 31 
P. A. M. DIRAC, The Principles of Q. M., §§ 74 and 76. 
G. BREIT, Proc. Nat. Ac. Sci. 14, 553 (1928). 
D. IVANENKO, C. R., 25 Feb. 1929. 
V. FOCK, Zs. f. Phys. 55, 127 (1929); 57, 261 (1929). 
W. GORDON, Zs. f. Phys. 50, 630 (1927). 


§§ 32, 33 
P. A. M. DIRAC, The Principles of Q. M., §§ 76, 77, 78. 
A. SOMMERFELD, Wellenmechanischer Ergänzungsband. 


§ 35 
P. A. M. DIRAC, The Principles of Q. M., § 75. 
O. LAPORTE and G. UHLENBECK, Phys. Rev. 37, 1380 (1931). 
H. WEYL, Gruppentheorie und Quantenmechanik, § 39. 


§ 36 
V. FOCK, Zs. f. Phys. 55, 127 (1929). 


§ 37 


E. SCHRÖDINGER, Abhandlungen zur Wellenmechanik, II. 
V. FOCK, Zs. f. Phys. 63, 855 (1930). 
H. WEYL, Gruppentheorie und Quantenmechanik. 
B. L. VAN DER WAERDEN, Die gruppentheoretische Methode in der 
Quantenmechanik. 




§ 38 
W. PAULI, Zs. f. Phys. 43, 601 (1927). 


§ 41 
P. A. M. DIRAC, Proc. Roy. Soc. 112, 661 (1926). 
The Principles of Q. M., ch. xi. 
E. WIGNER, Zs. f. Phys. 40, 883 (1927). 

§ 42 


J. C. SLATER, Phys. Rev. 34, 1293 (1929); 36, 57 (1930). 
W. PAULI, Rapport du Congrès Solvay de 1930, i, § 4. 
P. A. M. DIRAC, The Principles of Q. M., § 66. 


§ 43 
D. R. HARTREE, Proc. Cam. Phil. Soc. 26, 85 (1928). 


§ 44 
V. FOCK, Zs. f. Phys. 61, 126 (1930). 
P. A. M. DIRAC, Proc. Cam. Phil. Soc. 26, 376 (1930). 


§ 45 
V. FOCK, Zs. f. Phys. 81, 195 (1933). 
L. H. THOMAS, Proc. Cam. Phil. Soc. 23, 542 (1927). 
E. FERMI, Zs. f. Phys. 48, 73 (1928). 


§ 46 
P. JORDAN u. E. WIGNER, Zs. f. Phys. 47, 631 (1928). 
V. FOCK, Zs. f. Phys. 75, 622 (1932). 


§ 47 
P. A. M. DIRAC, Proc. Roy. Soc. 114, 243 and 710 (1927). 
P. JORDAN u. O. KLEIN, Zs. f. Phys. 45, 751 (1927). 
V. FOCK, Zs. f. Phys. 75, 622 (1932). 


§ 48 
P. A. M. DIRAC, The Principles of Q. M., ch. xin. 


§§ 49, 50 


W. HEISENBERG, Ann. d. Phys. 9, 338 (1931). 
V. WEISSKOPF u. E. WIGNER, Zs. f. Phys. 63, 54 (1930); 65, 18 (1930). 
E. FERMI, Reviews of Modern Physics, 4, 87 (1932). 
G. BREIT, Reviews of Modern Physics, 4, 504 (1932). 

§ 52 


P. JORDAN u. W. PAULI, Zs. f. Phys. 47, 151 (1927). 
W. HEISENBERG u. W. PAULI, Zs. f. Phys. 56, 1 (1927); 59, 168 (1930). 
L. LANDAU u. R. PEIERLS, Zs. f. Phys. 62, 188 (1930). 
P. A. M. DIRAC, Proc. Roy. Soc. 136, 453 (1932). 
V. FOCK and B. PODOLSKY, Sow. Phys. 1, 801 (1932). 
DIRAC, FOCK, and PODOLSKY, Sow. Phys. 2, 468 (1932). 


§ 53 
G. BREIT, Phys. Rev. 34, 553 (1929); 36, 383 (1930); 39, 616 (1932). 
CHR. MØLLER, Zs. f. Phys. 70, 786 (1931). 
L. ROSENFELD, Zs. f. Phys. 73, 253 (1932). 


INDEX TO PART I 


Absorption probability, 142. 
angular quantum number, 94. 
antisymmetric functions, 173. 
antisymmetry principle, 182. 
asymptotic solution, 79, 86. 
axial quantum number, 94. 


Bloch, F., 265, 271. 

Bohr, N., 53, 87, 96. 
Boltzmann's law, 194, 196, 276. 
Born, M., 33, 36, 172. 
Bose-Einstein statistics, 198. 
Bothe, W., 274, 278. 

Bragg’s formula, 25, 122. 
Brillouin, L., 209, 263. 

de Broglie, 20, 23, 55. 


Characteristic values, 87. 

compressibility (of metals), 221. 

Compton effect, 48. 

Condon, F., 105. 

configuration space, 167. 

contact potential difference, 244. 

Crystal lattice, motion of electron in a, 
121, 227, 230. 

— —, energy levels in a, 230, 233. 


Damped vibrations, 105. 

Darwin, C. G., 46. 

Davisson, 25. 

Debye, P., 160. 

decay constant, 105. 

degeneracy, 83, 96, 129, 176. 
Dempster, 27. 

Dirac, 142, 153, 155, 160, 182, 198. 
Doppler effect, 277. 

Drude, 253. 


‘Eigenvibrations’, 64. 

Einstein, A., 6, 9-12, 36, 135, 147, 198, 
277. 

electrical conductivity of metals, 254. 

electromagnetic waves, 18, 103, 188, 
190. 

electron gas in metals, 215. 

emission probability, 135. 

exchange degeneracy, 176. 

exclusion principle, 162, 182. 


Fermi, E., 198, 235. 

Fizeau, 7, 23, 27, 30. 
fluctuations (in a gas), 197, 205. 
forced transitions, 139. 
Fourier’s theorem, 41. 


Fowler, R. H., 113, 243. 
frequency relation, 53, 131. 


Gamow, G., 105, 106. 
Germer, 25. 

Goudsmit, 152. 

group velocity, 29, 230. 
Gurney, 105. 


Harmonic oscillator, 78-84, 160. 

heat capacity of electron gas, 218. 

Heisenberg, W. (uncertainty relation), 
46, 47-52. 

Hermitian relation, 137. 

holes, Dirac’s, 155. 

Huygens, 5. 

hydrogen-like atom, 84, 108. 


Identity principle, 158. 
image force, 240. 


Kronig, 123. 


Laplace’s equation, 92. 
leaking, 117. 

Liouville’s theorem, 43, 184. 
Lorentz, 8, 256. 


Magnetic susceptibility of electron gas, 
220. 

matrix, 136. 

matrix multiplication law, 149. 

Maxwell, 7, 253. 

mean free path of electron, 254, 257, 
261. 

momentum-energy vector, 11. 

multiplicity (of energy levels), 83. 


Newton, 1, 5, 23, 34. 


nodes, 57, 81, 88. 
Nordheim, 113, 243. 
normalizing condition, 126. 


Orthogonality relation, 130. 
overtones, 57. 


Pauli, W., 152, 162, 181, 198, 201, 220. 
Pauli-Fermi-Dirac statistics, 198, 215. 
Peierls, 265, 271. 

permutations, 176. 

perturbation energy, 141. 

phase-space, 42, 60, 158, 200. 
phonons, 265. 

photons, 12, 13, 135, 187, 272. 
Planck, M., 13, 160, 272. 
Poisson’s equation, 163, 234. 
polarization of light, 150. 
potential staircase method, 100. 
principal quantum number, 94. 
probability amplitudes, 62. 
— conception, 32. 
— current, 69. 
— density, 32. 
— packet, 38. 
— theory, classical, 61. 
proper time, 10. 


Quanta of energy, 13. 
quantized states, 53. 
— waves, 161. 
quantitative quantization, 162. 
quantum numbers, 94. 


Radiation force, 138. 
Raman effect, 269. 
rectification of contacts, 250. 
reflection partial, 5, 68. 
— total, 5, 79, 86. 
refractive index, 3. 
relative motion of two particles, 171. 
relativistic Schrödinger equation, 77. 
relativity, 7, 9. 
rest-energy, 11. 
reversibility law, 71, 144, 196, 207. 
Rupp, 29. 


Scattering of electrons, 258, 268. 
scattering of photons, 275. 
Schrödinger, E., 30, 64, 75, 77, 140, 170. 
Schrödinger’s equation, for stationary states, 74. 
Schrödinger’s equation, generalized, 140. 
— —, relativistic, 77. 
— —, for two particles, 170. 
Schrödinger’s theory of light emission, 132. 
selective reflection of electrons by a crystal lattice, 122. 
self-consistent field, 234. 
Sommerfeld, A., 54, 224, 242, 256. 
spherical harmonics, 92. 
spin, 152, 153, 183. 
spontaneous transitions, 139. 
statistical equilibrium, 199. 
symmetrical wave functions, 173. 
superposition principle, 61, 124. 


Tamm, Ig., 124. 
tensor of energy, 18. 
thermionic current, 242. 
Thomas-Fermi equation, 235. 
Thomson, G. P., 26. 
transition probability, 133. 
transmission coefficient, 69. 
transposition, 177. 
tunnel effect, 112. 
Tyndall effect, 258. 


Uhlenbeck, 152. 
uncertainty relation, 40, 50. 


Velocity, imaginary, 67. 
—, group, 29, 230. 


Wave equation of Schrödinger (see Schrödinger’s equation). 
wave number, 15. 
waves, standing, 56. 
Wiedemann-Franz law, 255. 


INDEX TO PART II* 


Action variables, 42. 

adjoint matrix, 139. 
amplitudes quantized, 485. 
angle variables, 42. 

angular quantum number, 53. 
axial quantum number, 53. 


Basic quantities, 155. 

Bohr’s interpretation of spin, 329. 
Bohr’s magneton, 256. 

bracket expression (Poisson), 57, 161. 


Canonical equations, 40. 

— transformations, 41, 144, 159. 

central quadrics as representatives of 
quantum variables, 173. 

centroid, 32. 

characteristic functions, 54. 

— values, 54. 

class (of permutations), 404. 

commutable operators, 56. 

commutation rules, 94. 

correspondence principle (Bohr’s), 98. 

curvilinear coordinates (Dirac’s equa- 
tion in), 364. 

cyclic permutations, 404. 


Damping (radiation), 491, 497. 

degenerate systems (perturbation of), 
186. 

delta function (Dirac’s), 84. 

density matrix, 432. 

duplicity phenomenon, 275. 


Ehrenfest, P., 32, 360. 

electric moment of the electron, 300, 
316. 

exchange forces, 422, 443. 


Factorized wave functions, 424. 
Fermi, E., 446. 

Fock, V., 329, 424, 443. 
fundamental frequencies, 43. 


Gordon, 321. 
Goudsmit, 279, 304. 


Hamilton-Jacobi equation, 18, 25, 243. 
Hartree, D., 423. 

Heisenberg, W., 97, 491, 506. 
Hermitian property (of matrices), 86. 


Jordan, P., 453. 


Klein, O., 346. 


Landé’s factor, 309. 
Larmor’s angular velocity, 309. 
Lorentz transformation, 356. 


Matrix multiplication law, 90. 
— transposed, 139. 

— unitary, 139. 

magnetic moment, 256, 275, 280. 
mixed matrix, 139, 441. 


Negative energy states, 277, 345. 
neutron, 348. 


Pauli, W., 280, 392, 417. 

— equations, 275. 

periodicity moduli, 38. 
permutations as operators, 404. 
photons (Dirac’s theory of), 477. 
positive electrons, 347. 
p-representation, 157. 


Quadruplicity phenomenon, 277, 345. 


Raman effect, 236, 499. 

relativity splitting (of spectral terms), 
338. 

relative degeneracy, 189. 


Scattering of light, 234, 484. 

—, Raman, 499. 

self-adjoint operators, 65. 

Slater, J. C., 413, 424. 

Sommerfeld’s fine structure formula, 
304. 

spin matrices of Pauli, 280. 

— — of Dirac, 316. 

spin of the electron, 294. 

— — a system, 385, 414. 


Thomas, L. H., 446. 
Uhlenbeck, 273, 304. 


Variational form of wave equation, 424. 
Virial theorem, 31. 


Wigner’s operator, 453. 


Zeeman effect normal, 257. 
— — anomalous, 307, 341. 


* This index does not make any claims at completeness, its purpose being simply to 
help the reader in locating the definitions of the main terms and some of the authors 
quoted in the text. 




